WordPress.org user, “definitelynot” discovered a bug in the WordPress plugin, WP Super Cache that could expose blogs to duplicate content penalties. Unfortunately this affects every blog that uses the plugin in “ON” or full “Super Cache” mode, and has URLs that end with the “/” (forward slash) character. If the plugin is on “half on” mode, you’ll be fine.
The problem is that an anonymous user might visit a legitimate URL, ending with a slash, the plugin then creates a static file out of that page, which is then used when people visit the same URL. Unfortunately if someone links to that URL without the ending slash, a visiting browser or search engine bot won’t be redirected to the proper URL, they’ll be served the static html file.
For example:
- John visits the URL /2007/05/23/why-the-nurses-cant-go-on-strike/ on my site. WP Super Cache creates a html file of that page.
- In his enthusiasm for that post, John publishes a post about those zany doctors, but he forgets the ending “/”.
- Googlebot, seeing fresh content on John’s site, crawls it and sees the link, visits my site eventually and wonders why it’s seeing the exact same page at two different URLs.
To be fair, Google is pretty good at figuring out where duplicate content is supposed to go but it’s better to avoid the issue completely. It also only matters if there are links to your site without the ending slash. The most common will probably be to your homepage as it’s likely internal URLs will be copy/pasted.
How to Fix
You should update to version 0.7 of the plugin which checks if your blog is affected by this problem. It also has instructions for updating the mod_rewrite rules in your .htaccess. It’s fairly easy to fix. Thank you “andylav” for the mod rewrite magic!
- Edit the .htaccess in the root of your WordPress install.
- You’ll see two groups of rules that look like this:
RewriteCond %{REQUEST_METHOD} !=POST RewriteCond %{QUERY_STRING} !.*s=.* RewriteCond %{QUERY_STRING} !.*wp-subscription-manager=.* RewriteCond %{QUERY_STRING} !.*attachment_id=.* RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress|wp-postpass_).*$ RewriteCond %{HTTP:Accept-Encoding} .*gzip.* RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz -f RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz [L] RewriteCond %{REQUEST_METHOD} !=POST RewriteCond %{QUERY_STRING} !.*s=.* RewriteCond %{QUERY_STRING} !.*wp-subscription-manager=.* RewriteCond %{QUERY_STRING} !.*attachment_id=.* RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress|wp-postpass_).*$ RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html -f RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html [L]
- You need to add the following 2 rules above each block of “RewriteCond” lines:
RewriteCond %{REQUEST_URI} !^.*[^/]$ RewriteCond %{REQUEST_URI} !^.*//.*$
- The rules should eventually look like this:
RewriteCond %{REQUEST_URI} !^.*[^/]$ RewriteCond %{REQUEST_URI} !^.*//.*$ RewriteCond %{REQUEST_METHOD} !=POST RewriteCond %{QUERY_STRING} !.*s=.* RewriteCond %{QUERY_STRING} !.*wp-subscription-manager=.* RewriteCond %{QUERY_STRING} !.*attachment_id=.* RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress|wp-postpass_).*$ RewriteCond %{HTTP:Accept-Encoding} .*gzip.* RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz -f RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html.gz [L] RewriteCond %{REQUEST_URI} !^.*[^/]$ RewriteCond %{REQUEST_URI} !^.*//.*$ RewriteCond %{REQUEST_METHOD} !=POST RewriteCond %{QUERY_STRING} !.*s=.* RewriteCond %{QUERY_STRING} !.*wp-subscription-manager=.* RewriteCond %{QUERY_STRING} !.*attachment_id=.* RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress|wp-postpass_).*$ RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html -f RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}/$1/index.html [L]
- Or you could just delete those rules and let the plugin regenerate them for you again.
PS. Thanks also to Lloyd for noticing the “enable the plugin” link was pointing at the wrong URL, and to Ryan who spotted a minor problem with the admin page and was kind enough to send me a Tweet about it.
PPS. I’ve just tagged 0.7.1 to fix some problems with the updating of the .htaccess, mainly for new users. If 0.7 of the plugin works for you, there’s no need to upgrade!