Preload the cache in WP Super Cache

See that nice dip in the graph for this week? I started to preload the cache used by WP Super Cache last Sunday and it’s made a noticeable difference in the load on my server here. The big spike is the preloading process.

I’ve always discouraged users from preloading the cache (Askapache Crazy Cache will do this for any cache plugin), mainly because of the possible problems so many files will cause for hosting companies. If you have thousands of cache files, it’s going to take so much longer to recover from a disk crash.
On the other hand, Google will now be using speed as a metric for judging how “good” a website is. In the past this plugin ignored the pages visited by bots because the bots only visited each page once, so caching a page after the fact was pointless. Every page has to be cached before Google ever visits.

That’s what it looks like. Once you start preloading it launches a wp-cron job to fetch 100 posts, then schedules another job 10 seconds in the future to fetch another 100 posts until it finishes. It also disables garbage collection of old pages, but making comments or posts will still clear out the appropriate cached files.
It only caches single posts right now. It may not be worth caching archive or tag pages because many sites already tell bots to ignore those pages. As the server is doing less work, it will serve those archive pages more quickly anyway.
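
The batching described above can be sketched roughly as follows. This is a Python illustration of the idea only, not the plugin's PHP code: `fetch` and `schedule` are hypothetical stand-ins for requesting a page and for wp-cron, and the function names are made up for the example. The batch size and delay are the figures mentioned above.

```python
from typing import Callable, List, Optional

BATCH_SIZE = 100     # posts fetched per job, as described above
DELAY_SECONDS = 10   # delay before the next job, as described above


def run_preload_job(post_urls: List[str], offset: int,
                    fetch: Callable[[str], None]) -> Optional[int]:
    """Fetch one batch; return the offset for the next job, or None when done."""
    batch = post_urls[offset:offset + BATCH_SIZE]
    for url in batch:
        fetch(url)  # requesting the page is what creates its cache file
    next_offset = offset + len(batch)
    return next_offset if next_offset < len(post_urls) else None


def preload_all(post_urls: List[str],
                fetch: Callable[[str], None],
                schedule: Callable[[int, Callable[[], None]], None]) -> None:
    """Run the first job now; each job queues the next one when it finishes."""
    def job(offset: int) -> None:
        next_offset = run_preload_job(post_urls, offset, fetch)
        if next_offset is not None:
            # Hypothetical scheduler call standing in for wp-cron.
            schedule(DELAY_SECONDS, lambda: job(next_offset))

    job(0)
```

Because the next job is only queued after the current batch has been fetched, a slow server simply takes longer overall rather than piling up jobs.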

The preloading only works if you’re using the plugin in Supercache or “ON” mode. It’s still a work in progress but has worked fine here. As well as the preloader the development version of the plugin has:

  1. Better support for mobile plugins.
  2. A cache tester.
  3. Can be configured to only delete the page a comment is left on, rather than the front page and associated pages.
  4. Works in WordPress 3.0.

It also has a number of bug fixes and other new features.

I need testers though, so grab the development version from the download page. Install it and please leave feedback here or preferably on the support forum.

By Donncha

Donncha Ó Caoimh is a software developer at Automattic and WordPress plugin developer. He posts photos at In Photos and can also be found on Google+ and Twitter.

167 replies on “Preload the cache in WP Super Cache”

great to hear. installed it on about 5 blogs now.. preloading seems fine, nice feature for sure (google..) – but on shared hosting? Mh.

the new preload developer version seems fine – no bugs. even its running great under wordpress 3.0 like you wrote it up 🙂

thank you very much for this, donncha!

I have a slow site, so let’s say that I cannot load 100 pages in 10 seconds. Will I get a huge backlog of cron jobs asking for another 100 pages?

It will only schedule the next job when the current 100 pages are done so it doesn’t matter how long it takes. (As long as the process itself doesn’t timeout!)

How long is a piece of string? Depends on your blog, how many posts you have, and how good your hosting provider is. I already talked to the guys at http://blacknight.ie/ and they were enthusiastic about the idea of precaching for performance reasons. If your hosting isn’t good enough, check them out! (and note the lack of affiliate codes on that link, I’m not earning a commission mentioning them!)

Okay, now I have a verbiage question!

Cache every page on your site. This will also disable garbage collection but that can be enabled later by setting a non-zero expiry time on this page.

My thought is ‘Okay, so once I’ve pre-cached, I go BACK down to the Expiry Time & Garbage Collection section and put my time in there.’ But then there are two places for expiry time: the Pre-Cache section and the Garbage collection. Am I right in assuming these are entirely independent of each other?

In the back of my head is the thought that combining these two, into a pre-cache and keeping the regular expiration, would work well. Only re-run the massive pre-cache once every couple months, and in the meantime, have your regular garbage collection going on every, say, 15-30 days? I have a feeling my brain is going about this the wrong way.

The preload time is a refresh when the cached pages are all cleared out and refreshed in one go. That’s completely different to the GC expiry time that clears out old files but leaves new ones.
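
A toy model may make the distinction clearer. This is only a sketch: the cache is represented as a dictionary mapping a URL to the time its file was created, which is a simplification for illustration and not how the plugin stores anything.

```python
from typing import Dict, Iterable


def garbage_collect(cache: Dict[str, float], now: float,
                    expiry_seconds: float) -> Dict[str, float]:
    """GC expiry: drop only the files older than the expiry time."""
    return {url: created for url, created in cache.items()
            if now - created <= expiry_seconds}


def preload_refresh(all_urls: Iterable[str], now: float) -> Dict[str, float]:
    """Preload refresh: clear everything and regenerate every page in one go."""
    return {url: now for url in all_urls}
```

With a 600 second expiry, garbage collection at time 1000 removes a file created at time 0 but keeps one created at time 900. A preload refresh at the same moment replaces both.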

What needs to happen is garbage collection on half-on files that are generated by people who have commented on the site or are logged in. They’re usually files that can only be used by one user so there’s no point leaving them there for ages.

Great idea! Thanks!

Could you please tell me what exactly is the difference between the two expire times? I want the posts to stick forever. And the front page to expire every 15 minutes. Is that doable?

Is there a way to tell if it is working or not? Do I need to log out as administrator? Can I look at the HTML?

Please see this comment. It isn’t possible to do what you want out of the box but it would be simple to code a plugin that deleted the supercache files for the homepage. Just delete wp-content/cache/supercache/hostname/index.html (and .gz) every 15 minutes …
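
As a rough sketch of that idea, written in Python rather than as a WordPress plugin: the function below removes the two homepage files and nothing else. The function name is hypothetical, the `.gz` file is assumed to be named `index.html.gz`, and running it every 15 minutes (from cron, for example) is left to whatever scheduler you have.

```python
from pathlib import Path
from typing import List


def clear_homepage_cache(wp_content: Path, hostname: str) -> List[Path]:
    """Delete the supercache files for the homepage; return what was removed."""
    cache_dir = wp_content / "cache" / "supercache" / hostname
    removed = []
    # Assumption: the gzipped copy sits next to index.html as index.html.gz.
    for name in ("index.html", "index.html.gz"):
        target = cache_dir / name
        if target.is_file():
            target.unlink()
            removed.append(target)
    return removed
```

Cached files for individual posts live in subdirectories, so they are left alone and the homepage is simply regenerated on the next visit.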

Let’s say that I delete the supercache files for the homepage every 15 minutes. Should I set the other expire times to 0? I’m still confused on the difference between them.

I would suggest you provide a few more settings for preloading.

Like how many pages to preload. (Old posts are never accessed that frequently, and even if they are accessed, chances are they are already cached by a previous request.)

Flexibility should be provided to us as to what range of pages we need to preload.
Something of that sort.
With 20000 posts on one of the ten blogs I have, my powerful dedicated server will soon see its free space drop to zero if I enable preload.

I haven’t used your plugin in a while. But now I’ve tried it again, with the new options, and it seems to be working fine!

I had some trouble at first, because even though I unchecked the option “Don’t cache pages for known users” I could not see the cached files. I even tried logging out of my WP blog, but still wasn’t able to see the cached files. I then discovered that even though I was logged out, I still had the cookie “wordpress_test_cookie” in my browser. “wordpress_test_cookie” matches this line in the .htaccess file

RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress|wp-postpass_).*$

Maybe you could change this line to

RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress_logged_in|wp-postpass_).*$
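
The effect of the two patterns can be checked outside Apache. The sketch below uses Python's `re` module as a stand-in for Apache's regex engine, which behaves the same way for patterns this simple. The function name and the example cookie values are made up for illustration.

```python
import re

CURRENT_RULE = re.compile(r"^.*(comment_author_|wordpress|wp-postpass_).*$")
PROPOSED_RULE = re.compile(r"^.*(comment_author_|wordpress_logged_in|wp-postpass_).*$")


def cookie_condition_passes(pattern: "re.Pattern[str]", cookie_header: str) -> bool:
    """The RewriteCond is negated with "!", so it passes when there is NO match."""
    return pattern.search(cookie_header) is None


# Hypothetical cookie header left behind after logging out.
test_cookie = "wordpress_test_cookie=WP+Cookie+check"
print(cookie_condition_passes(CURRENT_RULE, test_cookie))   # False
print(cookie_condition_passes(PROPOSED_RULE, test_cookie))  # True
```

With the current rule the leftover test cookie fails the condition, so the static file is skipped. With the proposed rule only a logged-in cookie, a comment author cookie or a post password cookie does that.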

I just logged into my blog and checked the settings again. It still says “Currently caching from post 200 to 300”. But I have more than 4000 posts…

It seems to have stopped (actually I’ve confirmed it by looking in the supercache directory).

Is it possible to make it continue?

Maybe my value is too high? “Refresh cache every 525600 minutes. (0 to disable)”

Hmm, actually, it’s quite hard to figure out if the cron job is running because if you set an option as a flag, the blog may have an object cache in which case that option won’t be set until after the cron job has finished (and a new job scheduled). I had to check for this type of thing happening in my Tweet Tweet plugin and made the plugin insert a flag directly in the options table so I could be sure it wouldn’t be cached.

Might need to do the same with this unfortunately.

I clicked “Save” again to start the cron job again. But then my previous cache was deleted… So now I’m back on about 10 cached posts.

Do I have to stay logged in for the cron job to continue?

No, the cron job fires when someone visits your site. Ironically, it’ll fire less often the more of your site is preloaded! As long as people are leaving comments on your site and “known users” are looking around the cron jobs will (or should) fire.

I’ve just checked in code that will allow you to limit the number of posts cached, and added a “preload mode” where only wp-cache files are cleaned up. The dev version will update in about 15 minutes.

Hi Donncha,

I transferred a WP site to a new server with WP super cache on it.
~Zip the entire directory
~Unzipped to the new server
~reconfigured database connection
~activating plugins [issues started]

–> Now the issue is my posts went blank after activating WP super cache
(2nd attempt)
~ Deactivated all plugins [posts came back]
~ Activated WishList Member [posts are okay]
~ Activated WP super cache [posts went blank again]

(3rd attempt)
~ Deactivated all plugins [posts came back]
~ Activated WP super cache [posts are okay]
~ Activated WishList Member [posts went blank again]

My Question now is, is there a compatibility issue between the two plugins? Given the fact that it was working previously on my old server?

Francis – try uninstalling WP Super Cache. I bet the path in wp-content/advanced-cache.php is wrong and if you installed the plugin a long time ago that file isn’t as resilient to change as it is now. Make sure to follow the instructions in the readme.txt and then install it again.

Donncha,

I have done == If all else fails and your site is broken == 😛

Installed the latest version of WP-SC and did the same steps. I still have the same issue… Do you think it is something to do with permalinks? Do you have more suggestions? 😀

BTW, thanks for the quick response, I really appreciate it.

I don’t know, check your error log for PHP errors. It does look like there’s some incompatibility there but I’ve never used the WishList plugin so I don’t know. Sorry.

sorry, ignore my comment. I still had the old rules inside .htaccess that is why it wasn’t showing me what to insert 🙁 – once I deleted the old rules, your plugin was showing me the new rules to insert

just curious about one thing: you built the preloading in ’cause google will now be using speed as a metric, so how come google doesn’t get a cached page? see this excerpt from one of my logging emails:

USER AGENT (Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)) rejected. Not Caching

also experiencing a weird behaviour: as I have been playing around with the settings a couple of times, including preloading the cache, etc., I noticed that the Expiry Time & Garbage Collection resets itself to 0 from time to time…

The rejected user agent is a throwback to when the plugin didn’t cache requests made by bots. Bots visit a page once and never go back to the same page (within a reasonable amount of time) so there was no point caching it. I’ll probably remove the bots from the rejected UA list for the next release just because it’s now important for bots to see fast page loads now as well.
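
For anyone wondering how a check like that behaves, here is a minimal sketch in Python. The list of strings is purely illustrative, and the case-insensitive substring match is an assumption about how such a list might be applied, not a description of the plugin's code.

```python
from typing import Iterable

# Illustrative strings only, not the plugin's actual rejected list.
REJECTED_STRINGS = ["bot", "spider", "crawl"]


def is_rejected_user_agent(user_agent: str, rejected: Iterable[str]) -> bool:
    """True if any rejected string appears in the User-Agent header."""
    ua = user_agent.lower()  # assumption: matching ignores case
    return any(fragment.lower() in ua for fragment in rejected)
```

With a list like this the Googlebot user agent quoted above is rejected because it contains "bot", while an ordinary browser user agent is not.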

The GC will reset itself to 0 if you’re using the code from yesterday but I made a few changes this morning. Grab the latest dev code!

MU – it’ll only preload the current blog. You’ll have to go into each blog’s admin page to reload each. I hadn’t thought of that, so thank you. Not sure if I’ll add a feature to cache all blogs though.

@Donncha: thx. also playing around with the debugging, not sure how helpful this message is:

URI rejected. Not Caching

how about also including the rejected URL? makes the message more useful 🙂

I downloaded the dev version ~1h ago.. strange.. will simply update it again.

@Donncha: just thought about some more feedback:

– the list showing the latest cached pages, is also limited to the main blog on a wpmu installation.
– a bit confused by GC:

Garbage Collection
Last GC was 11:41 minutes ago
Next GC in 58:19 minutes

Expired files are files older than 7200 seconds. They are still used by the plugin and are deleted periodically.

The explanation below says that if expiry time is above 1800secs GC is done every 10 minutes, how come mine says 60mins!?

Hmm. I have the “same” problem as some people have commented about here earlier.
The additional cronjobs don’t get done. I get 100 cached pages (being ~50 of my last posts and ~50 of my first posts, probably because of the wordpress “import” done somewhere along the line…).

Is it this plugin that has problems or should I hunt deeper into other plugins? (I know lifestream have some problems with the cronjobs itself, but do not know if it causes other cronjobs to fail).

The original implementation of this loaded the posts by reloading the admin page with a counter. You had to stay on the admin page as it reloaded using Javascript. It doesn’t look as nice but might be needed if there are problems with cron. 🙁

We struggled with Super Cache on a sports site with 30k pageviews/day. Endless fiddling and tweaking didn’t bring satisfactory results. The frontend was fast in all browsers except Firefox. The backend crawled resulting in endless complaints from editors. How come the admin panel is not cached?

We switched to wp-file-cache. Easy setup. No fiddling. Bingo! Great results in all browsers with both the frontend and backend.

Robert – there’s no exception or code in the plugin to exclude Firefox so you might have been logged in on that browser and excluding caching from “known users” perhaps? The plugin is a full page caching plugin and not appropriate for the backend which is supposed to be dynamic. If the backend was crawling you should install an object cache. Use Google, there are plenty of different ones.

a nice feature (on top of all the other great stuff) would be a little db caching (queries), maybe even in the dashboard like DB Cache or W3 Total Cache do it?

the new wp super cache with preloading runs still fine on my 5 blogs 🙂 thx!

Well. After some testing I have realized my description of the “cron” problem is not as I first said.
What it does is say: Caching page 100-200 (sic!)
And after 100 posts/pages are cached it then gives a time for the next job (a time which I don’t know has actually passed yet or not, as I’m not totally sure of the time on the server…)
And then it clears the cache and begins caching the first 100 posts/pages again (once again stating it’s page 100-200…)

Okay. My last comment for today:
After some research (echo date() and more ^^) this is what actually happens to me:
I click the button to pre-load my cache. It caches my last ~50 posts/pages, then ~50 from a few years ago (not the oldest ones) of my total ~300 posts/pages (stating it caches post “100 to 200”). Then it sets a cron timer to pre-load “everything” (i.e. the same 100 posts/pages) again after the specified time in the future. Even if the timer is set to 0 (off), the cron timer is set to [start-time of past pre-load + 10 seconds].

I’m on MU if that matters.
And oh, this plugin is not very important to me, I just thought I’d try it out and now that it doesn’t work I thought I’d tell you about it. So if you can’t find/fix the error, I’ll accept that it maybe is “only me” and disable the plugin 😉

This isn’t related to the development version. I have a question about the difference between WP Cache and WP Super Cache. I’ve enabled “ON WP Cache and Super Cache enabled”.

The Cache Contents tells me:
WP-Cache (58.48KB)
13 Cached Pages
0 Expired Pages
WP-Super-Cache (37.85MB)
1882 Cached Pages
0 Expired Pages

Can you explain why 13 pages are in WP-Cache? Shouldn’t all pages be in WP-Super-Cache. What does this mean to me?

@Donncha: I’ve read about the differences, but I still don’t understand why there are both WP-Cache files and WP-Super-Cache files. I’ve always had WP-Super-Cache enabled.

Donncha: I found the two errors that cause my troubles (I think).
Firstly, there’s no check whether “refresh preloaded files every [] minutes” is disabled (zero) when it has finished caching, meaning that if it’s set to zero it starts another caching run as soon as the preload is finished (a run which of course must be started with the preload cache now button).
Secondly, my site is too small (<1000 posts/pages), so posts_to_cache is never set, which means wp_cache_preload_posts is set to zero, which means it only does one caching run ($c <= 0?) and then thinks it is done.
So the cronjobs work just fine.

I was having the same problem as Daniel. Downloaded the latest and greatest dev build to test. This seems to have fixed it for me too!

Whew! I thought I was going insane 😉

@Ipstenu: Or … not as much as I thought. I got up to “Currently caching from post 500 to 600” and it stalled. It took about 30 minutes to get there and now it’s hanging. It’d be nice if you could ‘cancel’ the pre-caching.

@Ipstenu: Oh, I get the same “problem” (in my own “fixed” version ^^, gonna change to donncha’s in a minute), but I thought it was because I have 302 posts/pages of which 2 (frontpage and blog page) was set to no caching… (Causing it to think I have 302 while only having 300).
Wellwell, not a large problem anyway…

@Donncha I didn’t see a fix for the plan cronjob after ‘forced’ preloading even if time is 0 problem though, I guess that problem would still remain…. even though posts_to_cache is fixed. Will check it out soon.
I also moderated “away” a comment (an old comment that shouldn’t be visible anymore) and it cleared away my whole cache even though I have “Only refresh current page when comments made.” set to true, bug or intended behaviour?

Yep, the infinite preloading loop when pressing “Preload Cache Now” with the timer set to 0 ( = no scheduled preloading? Should mean that) is still there. (As I said before, a check of timer = 0 should be done before scheduling next preloading).
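
The check being asked for amounts to something like the sketch below. It is written in Python for illustration and is not the plugin's code or the eventual fix. The function name is hypothetical and timestamps are plain seconds.

```python
from typing import Optional


def next_preload_time(finished_at: float, refresh_minutes: int) -> Optional[float]:
    """Return when the next full preload should run, or None if disabled."""
    if refresh_minutes <= 0:
        # A timer of 0 means "no scheduled preloading", so queue nothing.
        return None
    return finished_at + refresh_minutes * 60
```

A manual “Preload Cache Now” with the timer at 0 would then run once and stop, instead of queuing itself again.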

Daniel – I’ve checked in a number of changes and that problem with the infinite loop is fixed now. The dev version should update in the next 15 minutes.

I got an error

———————————–
Rejected User Agents

Strings in the HTTP ’User Agent’ header that prevent WP-Cache from caching bot, spiders, and crawlers’ requests. Note that super cached files are still sent to these agents if they already exists.

Fatal error: Call to undefined function esc_html() in /www/htdocs/w0asdfff23/domain/wp-content/plugins/wp-super-cache/wp-cache.php on line 1069
