Styx - Blocking SPAM

Blocking SPAM with the “Spam Protector” plugin

What is this plugin?

Serendipity ships with a default event plugin called “Spam Protector” that is used to fight Blog SPAM.

This plugin is also installed by default in your Serendipity installation. You can find its configuration by going to the Plugin Configuration menu in your Admin Interface and clicking on the title “Spam Protector” in the list of event plugins.

The “Spam Protector” checks each comment and trackback that is made on your Blog, either via API (wfwComment, Trackback) or via your Blog interface (extended entry, comment popup).

Each comment will be analyzed through different, configurable facilities. If any configured option qualifies the comment as SPAM, it will be set to “moderation” or “rejected” state. Moderated comments will show up in your Admin section “Comments”, where you can either remove the comment or activate the comment to be seen on your Blog. If configured, you will also receive a mail about moderated comments.

To keep your Blog as free from SPAM as possible, there are a lot of individual options, which you will learn to configure in the next section.

Configuring the plugin

Emergency comment shutdown

If you enable this option, ALL comments and trackbacks to your Blog will be disabled. Nobody can then leave a comment. It is as if you checked the “Do not allow comments to this entry” checkbox for each of your entries.

This option is meant to be used when, for example, you go on holiday and want no SPAM to appear on your Blog.

Disable spamblock for Authors

The anti-spam methods can be bypassed for registered and logged-in authors. Here you can choose which author group is allowed to bypass them. “All authors” means that every registered author can bypass them, but you could also specify that bypassing is only allowed for Administrators. “none” means that authors are treated just the same as anonymous commenters.

Do not allow duplicate comments

If enabled, a comment that already exists in the database with the same body will be rejected. This option is very helpful to disallow mass-spam with the same contents.
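The duplicate check described above can be pictured with a small Python sketch. This is not the plugin's own code (Serendipity is written in PHP); the function name `is_duplicate` and the whitespace trimming are assumptions made for illustration only.

```python
from typing import Iterable


def is_duplicate(body: str, existing_bodies: Iterable[str]) -> bool:
    """Return True if a comment with the same body is already stored.

    Assumption: leading/trailing whitespace is ignored when comparing.
    """
    normalized = body.strip()
    return any(normalized == stored.strip() for stored in existing_bodies)


# Hypothetical comment bodies that are already in the database.
stored = ["Great post!", "Buy cheap pills at example.com"]

print(is_duplicate("Buy cheap pills at example.com", stored))  # True
print(is_duplicate("Thanks, this helped me.", stored))         # False
```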

Reject comments which only contain known text

Sometimes comment SPAM consists of nothing but the title of your entry. If you enable this option, SPAM like this will be rejected.

IP block interval

This setting forbids comments from the same IP within a specific timespan. You should be aware that different users might have the same IP if they are using a proxy. AOL users, for instance, are often forced to use the same proxy and could then run into problems if other AOL users comment on your Blog at the same time. For Blogs with a low comment volume, though, this option is helpful to prevent mass-attacks.
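To make the idea of an IP block interval more concrete, here is a minimal Python sketch. It is not taken from the plugin; the class name `IpThrottle`, the in-memory dictionary and the interval in seconds are all assumptions chosen for illustration.

```python
import time
from typing import Dict, Optional


class IpThrottle:
    """Remember when each IP last commented and refuse comments that come too soon."""

    def __init__(self, interval_seconds: int) -> None:
        self.interval_seconds = interval_seconds
        self._last_accepted: Dict[str, float] = {}

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        # "now" can be passed in explicitly, which makes the sketch easy to test.
        now = time.time() if now is None else now
        last = self._last_accepted.get(ip)
        if last is not None and now - last < self.interval_seconds:
            return False  # same IP commented within the interval
        self._last_accepted[ip] = now
        return True


throttle = IpThrottle(interval_seconds=60)
print(throttle.allow("192.0.2.1", now=0))    # True: first comment from this IP
print(throttle.allow("192.0.2.1", now=30))   # False: only 30 seconds later
print(throttle.allow("192.0.2.2", now=30))   # True: a different IP
```

The sketch also shows why shared proxies are a problem: two different people behind `192.0.2.1` would be indistinguishable here.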

Forbid direct comments (XSRF protection)

If enabled, visitors are not allowed to submit a comment when visiting your articles directly. This can block spambots, but also people who are commenting from their RSS readers or who have cookies disabled. This protection is implemented by setting a special hash field, which will only exist when a valid session was already started. This will also protect you from XSRF attacks that could trick you into submitting comments under false pretenses.
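The hash-field idea can be sketched as follows. This is only an illustration in Python and does not reproduce how the plugin builds or checks its hash; the secret key, the function names and the use of HMAC-SHA256 are assumptions.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical per-installation secret; a real system would store it persistently.
SECRET_KEY = secrets.token_bytes(32)


def make_token(session_id: str) -> str:
    """Derive the hidden form field value from the visitor's session ID."""
    return hmac.new(SECRET_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_submission(session_id: Optional[str], submitted_token: Optional[str]) -> bool:
    """Accept a comment only if a session exists and the hidden field matches it."""
    if not session_id or not submitted_token:
        return False  # no session (direct POST, cookies disabled) or no hash field
    expected = make_token(session_id)
    return hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8"))


session = "example-session-id"
token = make_token(session)
print(is_valid_submission(session, token))     # True: form was loaded in this session
print(is_valid_submission(None, token))        # False: no session was started
print(is_valid_submission(session, "forged"))  # False: hash field does not match
```

A bot that posts straight to the comment URL never loaded the form, so it has neither a session nor a matching hash. The same is true for a visitor with cookies disabled, which is the trade-off mentioned above.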

Enable Captchas

Captchas are currently the most effective measure you can take to prevent comment SPAM. Captchas are tests to check that a real human user is performing an action. In terms of Serendipity, this means that the commenting user will see a distorted image and needs to enter the characters shown in it. So far, most spambots are not able to perform the same task.

Captchas have three major issues:

  1. Sometimes they are hard to read even for “real” humans, and they might annoy people. Captchas are very hard to read for visually impaired users, and are not accessible to blind people.
  2. Captchas will be cracked sooner or later, forcing either stronger Captchas or different approaches to spamfighting. Algorithms already exist that crack some variations of other systems’ Captchas.
  3. Captchas can only be enabled for COMMENTS. Since trackbacks are machine generated by definition, they cannot use the Captcha image to be verified. This means that even though enabling Captchas for your users will block virtually all comment SPAM, trackbacks can still only be blocked by different means.

The option “scrambled captchas” will use a stronger Captcha implementation with varying pixels scattered within the Captcha to make them harder to read for automated bots.

Force Captchas after X days

With this option you can enter the number of days one of your articles needs to age until Captchas are enabled for it. Spambots usually pick up your new articles only after some days, which means that fresh articles on your Blog are less exposed to SPAM. Since most Blog comments are made within a short time after an article is published, it is a good idea to enforce Captchas when an article is, say, 7 days old. This will not annoy valid users commenting on your fresh articles, but will later keep spambots from dropping their garbage.
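The age rule boils down to a simple date comparison. The following Python sketch only illustrates that idea; the function name and the choice to require Captchas from exactly X days onwards are assumptions, not the plugin's documented behaviour.

```python
from datetime import datetime, timedelta


def captcha_required(published: datetime, now: datetime, force_after_days: int) -> bool:
    """Return True once the article is at least force_after_days old (assumed rule)."""
    return now - published >= timedelta(days=force_after_days)


published = datetime(2024, 1, 1)
print(captcha_required(published, datetime(2024, 1, 5), 7))  # False: article is 4 days old
print(captcha_required(published, datetime(2024, 1, 9), 7))  # True: article is 8 days old
```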

Background color of the Captcha

Here you can customize the background color of your auto-generated Captcha image.

Time frame for comments within X days

With Serendipity Styx 2.6 and up, commenting on articles can be globally limited to a period of X days starting from the article date. This can be useful to make sure your Blog is not flooded with bogus comments on old entries. The default value is "0", which allows comments on any existing article without age limit. Under normal conditions, your entries don't get many valid comments after a certain amount of time.

Force comment moderation after X days

You can enable comment moderation for each trackback and comment to an article here. Enter the age of an article in days after which new comments and trackbacks to it are put in “moderation” state.

Shall this time frame block later trackbacks too?

Block and reject potentially valid Trackbacks/Pingbacks that arrive for entries after this time frame has closed. (Other options may influence this behaviour.)

Comments classification after being auto-moderated?

Here you can set what happens to auto-moderated comments from the option above. Rejected comments will not be stored anywhere, while moderated comments can later be deleted or made visible. You will only receive mails about moderated comments, and you will not be notified of rejected comments.

Force trackback moderation after X days

In contrast to general comment moderation, you can specifically block only trackbacks to older entries. This way you can let normal (captcha’d) comments pass through, while trackbacks are generally dismissed.

Classification after being auto-moderated?

Just like the global auto-moderation, you can specify what happens to auto-moderated trackbacks.

How to treat comments made via APIs

Here you can indicate what happens to all API-made comments (Trackbacks, wfwComment). You can either globally moderate them or reject them, or with the “none” option they get no special treatment.

Check trackback URLs

This option will call the URLs that are given in a trackback and check whether the URL of your entry that they refer to really shows up on their page. This option can significantly decrease trackback SPAM, but it also makes your server perform URL requests to foreign servers, which costs both time and traffic.
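In essence, the trackback check fetches the remote page and looks for a link back to your entry. Here is a rough Python sketch of that idea; the function names, the timeout and the plain substring test are assumptions, and the plugin's real check may work differently.

```python
from urllib.request import urlopen


def fetch_page(url: str, timeout: float = 5.0) -> str:
    """Download the page a trackback claims to come from (costs time and traffic)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def page_links_to(page_html: str, entry_url: str) -> bool:
    """Return True if the remote page mentions the URL of your entry."""
    return entry_url in page_html


# Hypothetical page contents; in practice they would come from fetch_page().
entry_url = "https://blog.example.org/archives/42-my-entry.html"
honest_page = '<p>See <a href="https://blog.example.org/archives/42-my-entry.html">this</a></p>'
spam_page = '<p>Buy now at <a href="https://spam.example.net/">our shop</a></p>'

print(page_links_to(honest_page, entry_url))  # True
print(page_links_to(spam_page, entry_url))    # False
```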

Here you can specify the maximum number of links (http://…) that are allowed within a comment until it gets auto-moderated.

Here you can specify the maximum number of links (http://…) that are allowed within a comment until it gets rejected.
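How the two link limits work together can be sketched like this. The Python below is an illustration only: the function name, the pattern used to count links, and the assumption that a comment is affected once it contains more links than the configured maximum are not taken from the plugin.

```python
import re

# Assumption: every "http://" or "https://" occurrence counts as one link.
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def classify_by_links(body: str, moderate_after: int, reject_after: int) -> str:
    """Return "reject", "moderate" or "accept" depending on the number of links."""
    count = len(LINK_PATTERN.findall(body))
    if count > reject_after:
        return "reject"
    if count > moderate_after:
        return "moderate"
    return "accept"


one_link = "Nice summary, see also http://docs.example.org"
four_links = "http://a.example http://b.example https://c.example http://d.example"

print(classify_by_links(one_link, moderate_after=3, reject_after=10))    # accept
print(classify_by_links(four_links, moderate_after=3, reject_after=10))  # moderate
```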

Activate wordfilter

This will enable the wordfilter for URLs, author names and the comment body. If you set the option to “moderate”, all comments/trackbacks that contain a word of the following filters will be moderated. If you set it to “reject”, they will be completely rejected and you will get no notification about it.

Wordfilter for URLs, author names, comment body

In these large textareas you can enter a “;” delimited list of regular expressions. If you just want to block names like “casino”, “phentermine” etc., you can simply enter the words. But regular expressions also allow you a broader range of filtering. See more about regular expressions here: http://en.wikipedia.org/wiki/Regular_expression

The most important rule is that in Regular Expressions there are a few special characters: “.” means “any character”, and “.*” means “any amount of any character”.

You can separate each rule of the wordfilters also with a “;” plus linebreak, if that is easier for you to read.
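To get a feeling for how such a filter list behaves, you can try your rules in a small Python sketch like the one below. It is not the plugin's implementation: the function names are made up, case-insensitive matching is an assumption, and Python's regular expressions differ in some details from the ones PHP uses.

```python
import re
from typing import List, Pattern


def parse_filter(raw: str) -> List[Pattern[str]]:
    """Split a ";" delimited filter (line breaks allowed) into compiled rules."""
    rules = [part.strip() for part in raw.split(";")]
    return [re.compile(rule, re.IGNORECASE) for rule in rules if rule]


def matches_filter(text: str, patterns: List[Pattern[str]]) -> bool:
    """Return True if any rule matches somewhere in the text."""
    return any(pattern.search(text) for pattern in patterns)


# Example filter: two plain words and one rule using ".*".
raw_filter = "casino;\nphentermine;\ncheap.*pills"
patterns = parse_filter(raw_filter)

print(matches_filter("Visit our online CASINO today", patterns))  # True
print(matches_filter("cheap blue pills here", patterns))          # True
print(matches_filter("Nice article, thanks!", patterns))          # False
```

Note that a plain word such as “casino” also matches inside longer words, so keep your rules specific enough not to catch valid comments.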

Activate URL filtering by blogg.de Blacklist

The blogg.de blacklist contains a list of “bad” URL names that spambots enter as their homepage. The blogg.de blacklist is well maintained and should pick up new bad URLs quickly. You can decide whether to “moderate” comments that are marked as bad by the blogg.de blacklist, or to immediately “reject” them.

Akismet API KEY

Akismet.com is a central anti-spam and blacklisting server. It can analyze your incoming comments and check whether a comment has been listed as SPAM. Akismet was developed specifically for WordPress, but can be used by other systems. You just need an API Key from http://www.akismet.com/, which you get by registering an account at http://www.wordpress.com/. If you leave this API key empty, Akismet will not be used.

Akismet will inspect submitted comments according to its central blacklist and decide whether they are “bad” or “good”. The Akismet API also supports submitting uncaught SPAM, but this is not yet supported by this Serendipity plugin.

How to treat Akismet-reported SPAM

Set here what to do with comments/trackbacks that Akismet reports as SPAM.

Hide E-Mail addresses of commenting users

If this option is enabled, the email addresses of users that have commented on your Blog will not be displayed on your Blog. This option can also be set from within the general Serendipity configuration.

Check e-mail addresses

If set to “Yes”, email addresses will be checked for valid syntax. This prevents people from entering “nothing” or other invalid things as their email addresses.
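A syntax check of this kind can be pictured with the following Python sketch. The pattern is a deliberately simple assumption (something, an “@”, a domain with at least one dot) and is not the check the plugin performs.

```python
import re

# Assumed minimal shape: local part, "@", domain containing a dot, no spaces.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(value: str) -> bool:
    """Return True if the value has the rough shape of an email address."""
    return EMAIL_PATTERN.fullmatch(value.strip()) is not None


print(looks_like_email("jane@example.com"))  # True
print(looks_like_email("nothing"))           # False
```

A check like this only looks at the syntax. It cannot tell whether the address really exists.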

Required comment fields

Here you can enter which comment form fields you want your users to fill out. This way you can force people to leave their email address, for example.

Choose logging method

One very important thing in blocking SPAM is to check what you are actually blocking. Since you are only notified of moderated SPAM by email, you might want to check what kind of SPAM you reject every day. For that you can enable this option and set it to either “Database” or “File”. The preferred way is to use database-based logging, because then you can easily check the “serendipity_spamblocklog” DB table with tools like phpMyAdmin. There you will also see the rejection reasons!

Logfile location

If you chose to log to a File (maybe because checking the database is too cumbersome for you), you need to specify the file location of the logfile you want to create here. Make sure you enter the full and absolute path to your logfile, and that your webserver can write to that file.

Was that all?

Since SPAM adapts nearly daily, the best way to keep SPAM low is to check your wordfilters, URL filters and author name filters from time to time, and to add new common URLs or author names to your blacklist.

Adapting SPAM from the Comment Moderation Panel

In your comment moderation panel, you have the option to immediately configure Anti-Spam measures from the top of that panel.

You will also see a wrench icon near author names and URLs, which will put those names/URLs in the spamblock wordfilter automatically.

Questions?

Please read the important FAQ SPAM fighting notes about the SPAM plugin trinity over here.

If you have further questions about blocking SPAM, or have some ideas on how to improve fighting SPAM, please drop by on S9y-Origin Forums!