Bye bye Referer Spammers!

Take a quick look at your logfiles any time and you’re likely to see referer spam in there somewhere. Not only do those requests pollute your log files and stats pages, but they also consume resources on your server when you serve them pages that aren’t even going to be viewed by anyone. Here’s one way of stopping the spammers eating into your server resources:

  • Look through your logfiles and examine the referers. Here’s a quick bit of code to do that (it assumes the referer is the 11th field, as in Apache’s combined log format):
    awk '{print $11}' < /var/log/apache2/access_log | sort | uniq -c | sort -rn | grep -v "mydomain.com" | less
  • Copy and paste any likely looking referer spam sites somewhere else for safe keeping. The ones that use most of your resources will be at the top of the list.
  • Add this code to some file that every page on your site loads. It should be included before the main execution of the page occurs. Fill in the array of referer sites with the list you assembled from your log file. I’ve added a few from this morning’s log file.
    if( isset( $_SERVER["HTTP_REFERER"]  ) )
    {
        $referers_to_avoid = array(
                "ttp://texas-holdem.andrewsaluk.com",
                "ttp://www.highprofitclub.com/",
                "ttp://www.sex4singles.com/",
                "ttp://www.parishillton.com/",
                "ttp://www.moneylinebet.com/",
                "ttp://www.free-hentai-anime-sex.com",
                "ttp://www.bondage-bdsm.us",
                "ttp://www.handjob-movies.us",
                "ttp://www.zoothumbnails.com",
                "ttp://www.bestiality-animal-sex-stories.com",
                "ttp://www.gay-men-sex-movies.com",
                "ttp://russ-darrow-kia.gq.nu/",
                "ttp://nissan-xterra.sbn.bz/",
                "ttp://nissan-thermos.gq.nu/",
                "ttp://folding-chair.wol.bz/",
                "ttp://www.xcites-0-cost-interracial-cum-teen-sex-movie.com"
        );
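        // The entries above drop the leading "h", so a match never sits at position 0.
        // foreach replaces each(), which was removed in PHP 8.
        foreach( $referers_to_avoid as $val )
        {
            // strpos() returns false when there is no match, so compare against false explicitly.
            if( strpos( $_SERVER["HTTP_REFERER"], $val ) !== false )
            {
                die();
            }
        }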
    }
  • Add an error_log() to the “if” condition to spot when a spammer visits.
  • Add this to index.php of a WordPress installation to protect your blog and make your legitimate requests go that much faster!
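Here’s a rough sketch of the error_log() idea from the list above. The function name find_spam_referer() and the log message are my own inventions for illustration, and the two array entries are just a sample from the longer list.

```php
<?php
// Hypothetical helper: returns the blocklist entry found in $referer, or null.
function find_spam_referer( $referer, array $referers_to_avoid )
{
    foreach( $referers_to_avoid as $val )
    {
        // strpos() returns 0 for a match at the very start, so compare with false.
        if( strpos( $referer, $val ) !== false )
        {
            return $val;
        }
    }
    return null;
}

// Sample entries only; fill in the list from your own log file.
$referers_to_avoid = array(
    "ttp://www.highprofitclub.com/",
    "ttp://www.moneylinebet.com/"
);

if( isset( $_SERVER["HTTP_REFERER"] ) )
{
    $match = find_spam_referer( $_SERVER["HTTP_REFERER"], $referers_to_avoid );
    if( $match !== null )
    {
        // Record the hit in the PHP error log before ending the request.
        error_log( "Referer spam blocked: " . $_SERVER["HTTP_REFERER"] . " (matched " . $match . ")" );
        die();
    }
}
```

Keeping the matching in its own function means you can try it against a few referers from your log before putting it live.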

WordPress Multiuser, WP1.5 Sync

There’s a new snapshot available now! In brief, changes include increased use of PEAR Cache, an updated Smarty install, and referer listings that hide direct and internal requests by default. Kitten’s Spaminator and the WordPress code have been updated too, of course!
Go download it now and use the support forums if you have a question!
Later… I updated and packaged my collection of WPMU themes again and they’re available from the download page!

What's the GPL? WordPress and PEAR Cache Problems

Ben Ramsey explores some of the issues when you write GPLed code that uses code from the PEAR library.
I had forgotten about the differing licenses used by PEAR and WordPress. They’re unfortunately incompatible and you can’t ship PHP licensed code in a GPL project without an “exception clause” in your GPL license. A change to the license of WordPress would require the agreement of *all* copyright holders of code in the project AFAIK.
Thankfully, I don’t ship PEAR Cache with WordPress MU. I use it if it’s already installed, but WPMU isn’t dependent on PEAR Cache being available to work.
I think that gets around the incompatibility. Doesn’t it?

Google "nofollow"

Well, Google’s nofollow attribute is one way of putting off comment-spammers but it won’t stop them. They’ll continue to spam in case they come across a site that doesn’t support the new attribute.
I may look at adding rel="nofollow" to links here, but it’d be handy to have a list of “safe” URLs that can be linked to without it. Perhaps a WP plugin with a backend interface and a db table?
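Something like this could be a starting point. It’s only a sketch: add_nofollow() is a made-up name, the safe list is a plain array rather than a db table, and the regular expression only handles double-quoted href attributes.

```php
<?php
// Hypothetical helper: adds rel="nofollow" to links whose host is not on the safe list.
function add_nofollow( $html, array $safe_hosts )
{
    return preg_replace_callback(
        '/<a\s+([^>]*?)href="([^"]+)"([^>]*)>/i',
        function( $m ) use ( $safe_hosts )
        {
            $host = parse_url( $m[2], PHP_URL_HOST );
            // Leave relative links, safe hosts and links that already have a rel attribute alone.
            if( !is_string( $host )
                || in_array( strtolower( $host ), $safe_hosts, true )
                || stripos( $m[0], 'rel=' ) !== false )
            {
                return $m[0];
            }
            return '<a ' . $m[1] . 'href="' . $m[2] . '"' . $m[3] . ' rel="nofollow">';
        },
        $html
    );
}

// Assumed safe list for illustration; a real plugin would load this from the db.
$safe_hosts = array( "example.com" );
$comment = '<a href="http://spam.example/">x</a> and <a href="http://example.com/post">y</a>';
echo add_nofollow( $comment, $safe_hosts ) . "\n";
```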
In other news, the number of spams getting through to the moderation queue has dwindled down to zero (besides the edgesaver one of course!) so it’s not a problem here right now.

PEAR Output Cache and WPMU

One of those things PHP doesn’t do well is load large libraries of scripts into memory and parse them quickly. WordPress and Smarty fall firmly into this category and the caching I’ve done so far has only addressed half of this problem: cached pages don’t load any WordPress code, but the heavy Smarty library was still loaded.

I’ve now modified my code to use the PEAR Output Cache, which avoids loading the Smarty templating system. On a heavily loaded server this should make quite a difference to visitors. Things still need tweaking: I may need to introduce a “time” aspect to the cache key, as pages could otherwise be cached indefinitely, but for now it seems to be working really well!
Note to self, please remember that gzip compression upsets PHP’s output buffer. Note to everyone else, don’t set “gzip compression” in the backend or your blog will be foobarred! (Thanks Mel for testing his blog and reporting problems with it!)
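One way the “time” aspect of the cache key could work is to fold a time bucket into the key, so a page’s key changes once every so many seconds. This is a sketch only: timed_cache_key() is a hypothetical function, it doesn’t call PEAR Cache at all, and the five minute lifetime is an assumption.

```php
<?php
// Hypothetical helper: builds a cache key that changes every $lifetime seconds.
function timed_cache_key( $url, $lifetime, $now )
{
    // Integer bucket number: the same for every request inside one $lifetime window.
    $bucket = (int) floor( $now / $lifetime );
    return md5( $url . "|" . $bucket );
}

// Usage: pages are regenerated at most once every 300 seconds (assumed value).
$key = timed_cache_key( "/main/", 300, time() );
```

Old entries aren’t removed by this, they just stop being requested, so the cache would still need cleaning out now and then.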

WPMU Registration Page – alpha quality

Open this file in your browser, download and copy into your wp-inst/ folder as “wp-newblog.php”
Create the following entry in your root .htaccess file too (modify to suit your site):
RewriteRule ^([_0-9a-z-]+)/wp-newblog\.php(.*) /wp-inst/wp-newblog.php [L]
Call it by going to http://example.com/main/wp-newblog.php
It’s a very simplistic script with no error checking. It creates a new directory before handing over control to the WordPress install.php
Give it a whirl but when you’re finished be sure to delete it or move it out of the way!
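If you want to check which paths the rewrite pattern will catch before touching .htaccess, you can try the same pattern in PHP. This is a sketch: blog_from_path() is a made-up helper, the dot is escaped here, and it assumes the path arrives without a leading slash, as it does for rules in a root .htaccess file.

```php
<?php
// Hypothetical helper: returns the blog name captured by the first group, or null.
function blog_from_path( $path )
{
    // "#" is used as the delimiter so the slashes in the pattern need no escaping.
    if( preg_match( '#^([_0-9a-z-]+)/wp-newblog\.php(.*)#', $path, $m ) )
    {
        return $m[1];
    }
    return null;
}

var_dump( blog_from_path( "main/wp-newblog.php" ) );
var_dump( blog_from_path( "main/index.php" ) );
```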

WordPress Multiuser Snapshot

I decided to release a new snapshot of WPMU today because of a problem posting comments. There are also a few more changes, listed below. Go download it and play!

  • The referer plugin now uses PEAR Cache. This should speed up pages significantly as MySQL won’t have to lock so many reads when the table is updated!
  • Kitten’s Spaminator has been upgraded to the latest version. This version has a very nice admin page to configure it without editing the plugin file itself.
  • A bug in comment posting stopped them getting through, fixed.
  • To speed things up, template compile checks have been turned off. If you change a template, make sure you do it from the backend.
  • Misc bug fixes and WP upgrades.

Later I looked into the SQL errors people were having during the install and, thanks to Chuck, I think I figured out what happened. If you’re getting SQL errors about table names with “wp-inst” in them, make sure you’re not calling setup-config.php or install.php using the URLs http://example.com/wp-inst/wp-admin/setup-config.php and http://example.com/wp-inst/wp-admin/install.php
The correct URLs are http://example.com/main/wp-admin/setup-config.php and http://example.com/main/wp-admin/install.php
The “main” part of the URL is handled by mod_rewrite. Follow the links in the installer and you’ll be fine!
I’m going to update the installer so it knows to replace “wp-inst” with “main” when people call the installer incorrectly.
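Roughly, the installer fix could look like the sketch below. This is not the actual WPMU code: fix_installer_path() is a hypothetical name, and it assumes a redirect to the “main” URL is acceptable at that point in the install.

```php
<?php
// Hypothetical helper, not the actual WPMU installer code.
function fix_installer_path( $path )
{
    // Only touch a leading "/wp-inst/"; anything else is returned unchanged.
    return preg_replace( '#^/wp-inst/#', '/main/', $path, 1 );
}

// Usage sketch: redirect before any output is sent.
if( isset( $_SERVER["REQUEST_URI"] ) )
{
    $fixed = fix_installer_path( $_SERVER["REQUEST_URI"] );
    if( $fixed !== $_SERVER["REQUEST_URI"] )
    {
        header( "Location: " . $fixed );
        exit;
    }
}
```

Because the redirected request starts with “/main/”, it won’t be rewritten a second time, so there’s no redirect loop.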

Here are the URLs you’ll be calling when you install WPMU, the first page is a WPMU page, but the rest is bog-standard WordPress installer stuff:

Here are the links I go to on my local machine:
http://localhost/ – creates directories, page starts with text, “Welcome to WordPress MU, the Multi User Weblog System built on WordPress.
You’re probably seeing this message because you’re installing WPMU!”

The link at the bottom of that page points at this page:
http://localhost/main/wp-admin/setup-config.php
This is almost the same file that exists in WordPress itself.

Hit “let’s go!” at the bottom of that page to go to:
http://localhost/main/wp-admin/setup-config.php?step=1
Fill in the form there and go to:
http://localhost/main/wp-admin/setup-config.php?step=2
where you’re asked to “run the install!”

That goes to
http://localhost/main/wp-admin/install.php
and clicking on “First Step” goes to
http://localhost/main/wp-admin/install.php?step=1
to fill in weblog title and email address.

After submitting that form I get a login link and username/password to use when logging in at http://localhost/main/wp-login.php

I fixed the table prefix bug too, but that’s not in this release. It’ll be in the next one!