Varnish is an open source, state of the art web application accelerator.
What it does is make your existing site faster by caching requests so your web server doesn’t have to handle them. This helps because your web server may be a lumbering giant like Apache that is loaded up with extra functionality like PHP, the GD library, mod_rewrite and all the other tools you need to make your website. All these modules unfortunately make your general purpose web server slower and heavier so by avoiding it your site spits out pages much faster!
Varnish sits in front of your web server. Most documentation I’ve read on the subject suggests having Apache listen on any port other than port 80 and having Varnish listen on port 80 of the external IP address. There’s no need to do this: I configured Apache to listen on port 80 of the 127.0.0.1 (localhost) address while Varnish listens on port 80 of the external IP.
Installing Varnish
Setting up Varnish is fairly easy. I’m going to assume that you’re already using Apache. On a Debian based system just use this to install it (as root):
apt-get install varnish
Apache
You need to configure Apache first. It has to listen on port 80 of the localhost interface. Edit /etc/apache2/ports.conf and change the following settings:
NameVirtualHost 127.0.0.1:80
Listen 127.0.0.1:80
Normally Apache listens on port 80 of all interfaces so you’ll probably just have to add “127.0.0.1:” in front of the 80.
Varnish
By default Varnish won’t start. You need to edit /etc/default/varnish. Change the following options in that file:
START=yes
DAEMON_OPTS="-a EXTERNAL_IP_ADDRESS:80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s file,/var/lib/varnish/$INSTANCE/varnish_storage.bin,1G"
Replace EXTERNAL_IP_ADDRESS with your server’s external IP address.
Now edit /etc/varnish/default.vcl. The file should already exist but most of it is commented out. First of all change the “Backend default”:
backend default {
    .host = "127.0.0.1";
    .port = "80";
}
This tells Varnish that Apache is listening on port 80 of the localhost interface.
I’m going to define several functions in the default.vcl now. Comments in the code should explain what most of it does.
# Called after a document has been successfully retrieved from the backend.
sub vcl_fetch {
    # Uncomment to make the default cache "time to live" 5 minutes. Handy,
    # but it may cache stale pages unless purged. (TODO)
    # By default Varnish will use the headers sent to it by Apache (the backend server)
    # to figure out the correct TTL.
    # WP Super Cache sends a TTL of 3 seconds, set in wp-content/cache/.htaccess
    # set beresp.ttl = 300s;

    # Strip cookies for static files and set a long cache expiry time.
    if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
        unset beresp.http.set-cookie;
        set beresp.ttl = 24h;
    }

    # If WordPress cookies found then page is not cacheable
    if (req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)") {
        set beresp.cacheable = false;
    } else {
        set beresp.cacheable = true;
    }

    # Varnish determined the object was not cacheable
    if (!beresp.cacheable) {
        set beresp.http.X-Cacheable = "NO:Not Cacheable";
    } else if ( req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)" ) {
        # You don't wish to cache content for logged in users
        set beresp.http.X-Cacheable = "NO:Got Session";
        return(pass);
    } else if ( beresp.http.Cache-Control ~ "private") {
        # You are respecting the Cache-Control=private header from the backend
        set beresp.http.X-Cacheable = "NO:Cache-Control=private";
        return(pass);
    } else if ( beresp.ttl < 1s ) {
        # You are extending the lifetime of the object artificially
        set beresp.ttl = 300s;
        set beresp.grace = 300s;
        set beresp.http.X-Cacheable = "YES:Forced";
    } else {
        # Varnish determined the object was cacheable
        set beresp.http.X-Cacheable = "YES";
    }

    # Never keep error pages in the cache
    if (beresp.status == 404 || beresp.status >= 500) {
        set beresp.ttl = 0s;
    }

    # Deliver the content
    return(deliver);
}

sub vcl_hash {
    # Each cached page has to be identified by a key that unlocks it.
    # Add the browser cookie only if a WordPress cookie found.
    if ( req.http.Cookie ~"(wp-postpass|wordpress_logged_in|comment_author_)" ) {
        set req.hash += req.http.Cookie;
    }
}

# Deliver
sub vcl_deliver {
    # Uncomment these lines to remove these headers once you've finished setting up Varnish.
    #remove resp.http.X-Varnish;
    #remove resp.http.Via;
    #remove resp.http.Age;
    #remove resp.http.X-Powered-By;
}

# vcl_recv is called whenever a request is received
sub vcl_recv {
    # remove ?ver=xxxxx strings from urls so css and js files are cached.
    # Watch out when upgrading WordPress, need to restart Varnish or flush cache.
    set req.url = regsub(req.url, "\?ver=.*$", "");

    # Remove "replytocom" from requests to make caching better.
    set req.url = regsub(req.url, "\?replytocom=.*$", "");

    # Pass the real client IP on to Apache
    remove req.http.X-Forwarded-For;
    set req.http.X-Forwarded-For = client.ip;

    # Exclude this site because it breaks if cached
    #if ( req.http.host == "example.com" ) {
    #    return( pass );
    #}

    # Serve objects up to 2 minutes past their expiry if the backend is slow to respond.
    set req.grace = 120s;

    # Strip cookies for static files:
    if (req.url ~ "\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|pdf|txt|tar|wav|bmp|rtf|js|flv|swf|html|htm)$") {
        unset req.http.Cookie;
        return(lookup);
    }

    # Remove has_js and Google Analytics __* cookies.
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
    # Remove a ";" prefix, if present.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    # Remove empty cookies.
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }

    # Note: the "purge" ACL referenced here must be defined elsewhere in this file.
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        purge("req.url ~ " req.url " && req.http.host == " req.http.host);
        error 200 "Purged.";
    }

    # Pass anything other than GET and HEAD directly.
    if (req.request != "GET" && req.request != "HEAD") {
        return( pass );
    } /* We only deal with GET and HEAD by default */

    # remove cookies for comments cookie to make caching better.
    set req.http.cookie = regsub(req.http.cookie, "1231111111111111122222222333333=[^;]+(; )?", "");

    # never cache the admin pages, or the server-status page
    if (req.request == "GET" && (req.url ~ "(wp-admin|bb-admin|server-status)")) {
        return(pipe);
    }

    # don't cache authenticated sessions
    if (req.http.Cookie && req.http.Cookie ~ "(wordpress_|PHPSESSID)") {
        return(pass);
    }

    # don't cache ajax requests
    if (req.http.X-Requested-With == "XMLHttpRequest" || req.url ~ "nocache" || req.url ~ "(control.php|wp-comments-post.php|wp-login.php|bb-login.php|bb-reset-password.php|register.php)") {
        return (pass);
    }

    return( lookup );
}
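If you want to check what the cookie-stripping rules in vcl_recv leave behind before loading them into Varnish, the same regular expressions can be tried out in Python. This is only a rough sketch: the function name clean_cookie_header is made up for illustration, and it assumes Python's re module treats these particular patterns the same way Varnish's regex engine does.

```python
import re

# Same pattern the VCL passes to regsuball(): has_js plus the Google
# Analytics cookies whose names start with a double underscore.
TRACKING_COOKIES = re.compile(r"(^|;\s*)(__[a-z]+|has_js)=[^;]*")


def clean_cookie_header(cookie):
    """Return the Cookie header without tracking cookies, or None if nothing is left.

    Hypothetical helper mirroring the three cookie steps in vcl_recv.
    """
    if cookie is None:
        return None
    # regsuball(): remove every matching cookie.
    cookie = TRACKING_COOKIES.sub("", cookie)
    # regsub(): remove a leading ";" left behind (first match only).
    cookie = re.sub(r"^;\s*", "", cookie, count=1)
    # An empty header is dropped entirely, like "unset req.http.Cookie".
    if re.match(r"^\s*$", cookie):
        return None
    return cookie


if __name__ == "__main__":
    print(clean_cookie_header("has_js=1; __utma=123.456"))
    print(clean_cookie_header("has_js=1; __utma=123.456; wordpress_logged_in_x=abc"))
```

A request that only carries has_js and Google Analytics cookies ends up with no Cookie header at all, so it can be served from the cache, while a WordPress login cookie survives and is handled by the later rules.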
Notes:
- Varnish caches Javascript and CSS files without the cache buster ?ver=xxxx parameter. Varnish doesn’t cache any url with a GET parameter so those files weren’t getting cached at all.
- The code removes the Cookies for Comments cookie after it checks for GET and HEAD requests. This improved caching significantly, as web pages are no longer cached both with and without that cookie; they are all cached without it. The cache hit/miss ratio went up significantly when I made these two changes.
- I have a private site on this server that requires login. I had to stop Varnish caching this site as the privacy plugin thought I wasn’t logged in. See the example.com code above.
- If pages were purged Varnish could store cached pages for much longer.
As I didn’t modify WordPress to issue PURGE commands, there are probably issues with the cache keeping slightly stale pages, but I haven’t seen it happen or received complaints about it.
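The effect of the two regsub() calls on req.url mentioned in the first note can be previewed with a small Python sketch. The function name normalize_url and the sample URLs are invented for illustration; the patterns are copied from the VCL above.

```python
import re


def normalize_url(url):
    """Strip the ?ver= and ?replytocom= query strings, as vcl_recv does.

    Hypothetical helper; count=1 mirrors regsub(), which replaces only
    the first match.
    """
    url = re.sub(r"\?ver=.*$", "", url, count=1)
    url = re.sub(r"\?replytocom=.*$", "", url, count=1)
    return url


if __name__ == "__main__":
    print(normalize_url("/wp-includes/js/jquery/jquery.js?ver=1.4.2"))
    print(normalize_url("/2011/03/some-post/?replytocom=42"))
    print(normalize_url("/style.css?foo=1&ver=2"))
```

Note that the patterns only match when ver or replytocom is the first query parameter, so the last URL passes through unchanged.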
PHP
Since all requests to Apache come from the local server PHP will think that the remote host is the local server. By using an auto_prepend_file set in your php.ini or .htaccess file you can tell PHP what the real IP is with this code:
<?php
// Use the client IP that Varnish passes along in X-Forwarded-For.
if ( isset( $_SERVER[ "HTTP_X_FORWARDED_FOR" ] ) ) {
    $_SERVER[ 'REMOTE_ADDR' ] = $_SERVER[ "HTTP_X_FORWARDED_FOR" ];
}
You’ll see a huge improvement if you use Apache, especially if you don’t use a full page caching plugin like WP Super Cache on your WordPress site.
To see exactly how well Varnish is working use varnishstat and watch the ratio of cache hit and miss requests. This will vary depending on your TTL and by how much time Varnish has had to populate the cache. You can also configure logging using varnishncsa as described on this page:
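If you would rather have a single number than watch the counters scroll by, the hit ratio can be worked out from the cache_hit and cache_miss counters that `varnishstat -1` prints. The sketch below is an assumption-laden example: the function name hit_ratio is made up, the sample output is invented rather than taken from my server, and it assumes each line starts with the counter name followed by its value.

```python
def hit_ratio(varnishstat_output):
    """Return hits / (hits + misses) from `varnishstat -1` text, or None if no data."""
    counters = {}
    for line in varnishstat_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Newer Varnish versions prefix names, e.g. "MAIN.cache_hit".
        name = fields[0].split(".")[-1]
        if name in ("cache_hit", "cache_miss"):
            counters[name] = int(fields[1])
    hits = counters.get("cache_hit", 0)
    misses = counters.get("cache_miss", 0)
    total = hits + misses
    return hits / total if total else None


# Invented sample output, for illustration only.
SAMPLE = """\
client_conn              1200         0.40 Client connections accepted
cache_hit                 900         0.30 Cache hits
cache_hitpass              10         0.00 Cache hits for pass
cache_miss                300         0.10 Cache misses
"""

if __name__ == "__main__":
    print(hit_ratio(SAMPLE))
```

To use it against a live server you could pipe the output of `varnishstat -1` into a script that calls hit_ratio() on what it reads from standard input.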
varnishncsa -a -w /var/log/varnish/access.log -D -P /var/run/varnishncsa.pid
Now use multitail to watch /var/log/varnish/access.log and your web server’s access log.
I used a number of sites for help when setting this up. Here are a few:
- Tutorial: Setting up Varnish with Apache.
- Putting Varnish In Front Of Apache On Ubuntu/Debian
- WordPress + nginx + Varnish + Apache 2
- MediaWiki Manual:Varnish caching
- Using Varnish So News Doesn’t Break Your Server
I have tried Nginx in the past but could not get it working without causing huge CPU spikes as PHP went a little mad. In comparison, Varnish was simple to install and set up. Have you tried Varnish yet? How can I improve the code above?
Edit: It looks like someone else has done the hard work. I must give the WordPress Varnish plugin a go.
This plugin purges your varnish cache when content is added or edited. This includes when a new post is added, a post is updated or when a comment is posted to your blog.
Glad the article on my site was of use to you. =)
Support for Varnish purging is one of a couple of reasons why I switched from WP Super Cache to W3 Total Cache (sorry!!). Everything’s super quick, Varnish serves the static files much quicker than Apache, and the number of concurrent users Varnish can handle is positively obscene. With WordPress now purging Varnish when content is updated, I’ve barely had any situations where Varnish is serving out-of-date content, too.
Varnish is genuinely amazing. Supremely capable, and with a learning curve that’s essentially flat. You can get it up and running in about an hour, and can get a pretty awesome hit rate in an afternoon — it’s lovely.
http://cd34.com/blog/infrastructure/w3-total-cache-and-varnish/
Last time I checked W3TC causes issues with Varnish that another required plugin solves on WordPress installations.
Rob – if you ever feel the need to tinker with caching plugins again I found a WordPress plugin that purges the Varnish cache! It’s in the post above right at the end.
And I agree, it’s fairly easy to get Varnish going. It’s a shame it’s not given more publicity. I did find that my config needed some fine tuning, and it’s not quite as versatile as a PHP script is (i.e. mobile client support) but it’s only a matter of putting in some time to figure it out.
Not sure why anyone would prefer a Varnish+Apache setup when Nginx-PHP-FPM+WP Super Cache is much faster, and easier to set up. Nginx can go even faster than the benchmark in the link below using its built-in ncache module, which is also easy to set up using tutorials available.
http://nbonvin.wordpress.com/2011/03/14/apache-vs-nginx-vs-varnish-vs-gwan/
If the Gwan (http://gwan.com/) web server begins to roll out configurations that can replace Apache and Nginx, it could become the dominant web server. Right now Nginx is still the fastest web server that can be used for WordPress, without the Apache bloat.
Todd – in my case Varnish was much easier to set up as I didn’t have to learn about how nginx does virtual hosts, or change Apache rewrite rules or any of the other Apache things I use.
I saw that post yesterday. Gwan is interesting but a non-starter. I cannot imagine writing web apps in C that need to be compiled to run. The author’s attitude to Windows is so immature it put me off even trying it too. I don’t care for Windows as a server operating system but he went out of his way to appear childish in mocking it.
I’ve done something similar. Except I’ve gone for a Nginx + apache setup.
Apache is set to port 81 (Simply so i can bypass Nginx easily), Nginx does a reverse proxy on port 80 to 81.
Primary gain to me is I get to continue to use Apache functionality (mod_php + the http svn module) with a config file I’m used to, and didn’t have to worry about configuring fastcgi PHP processes (a pain that’s always bitten me), and Nginx sitting passively serving up the cached content.
In the process of setting that up, I dumped using WP Super Cache, as Nginx is doing a far “better job”. I don’t exactly have a high amount of traffic, but my memory consumption is sitting nice, CPU usage is constant, and the logs indicate constant requests happening.
I tried using Nginx and Apache too but that didn’t work either! Like you I’m “blessed” with a site with little traffic so even if there are better solutions out there I’m not in dire need of looking for it!
re: G-wan. Good reading your comment. I was genuinely excited about that project, but then I came in contact with its author. Never have I met a worse advocate for a software project… Simply because I stated that the tone of an article comparing Nginx to Gwan wasn’t very neutral (it’s not), he went into full-on attack mode, dropping ad hominems about everything from my computer skills to my profession (I’m a journalist by trade).
Such a shame, too; its performance figures had my attention for a second, there…
Nginx + PHP + WordPress is not hard to pull off and what are these PHP spikes you speak of?
After I got it working and PHP configured to run persistently the load average on the server went up and up all the time. I couldn’t figure out what was wrong but the PHP processes were there right at the top of the output of top.
When I went back to Apache things got back to normal. It wasn’t that the Supercache files weren’t being used or whatever because when I went back to Apache that was serving content directly for a bit while I got the mod_rewrite rules in there.
I use php-cgi and don’t have much problems, plus I use memory caching with memcached and object-cache.php which enables memcache to work.
Oh you know what, php-cgi carries a high load sometimes. A /etc/init.d/php-cgi restart sometimes helps.
I wanted to give it a try, but I think setting it up is a little bit too technical for a non-techie guy like me… Is there any other way to make it simple?
neo – unfortunately this is as simple as it gets, but worth looking at as it will make your site faster!
It doesn’t seem to me that this would work with multiple virtual hosts. Do I understand that correctly?
It works just fine with multiple hosts. I have about 5 different domains on this server, all going through Varnish. The hash key is generated from the hostname + url, with the cookie added in the code above.
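As a rough illustration of why multiple hosts don’t collide, here is a Python sketch of how such a key could be put together. It is not how Varnish computes its hash internally: the function name cache_key, the use of SHA-256 and the example hostnames are all assumptions made for the example. Only the ingredients (URL, hostname, and the cookie when a WordPress cookie is present) come from the config above.

```python
import hashlib
import re

# Same cookie names the vcl_hash block above looks for.
WORDPRESS_COOKIES = re.compile(r"(wp-postpass|wordpress_logged_in|comment_author_)")


def cache_key(host, url, cookie=None):
    """Build an illustrative cache key from URL + host (+ cookie for WordPress users)."""
    parts = [url, host]
    # The cookie only becomes part of the key when a WordPress cookie is present.
    if cookie and WORDPRESS_COOKIES.search(cookie):
        parts.append(cookie)
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Same URL on two (made-up) hosts gives two different keys.
    print(cache_key("a.example", "/") == cache_key("b.example", "/"))
    # Anonymous visitors with unrelated cookies share one cached copy.
    print(cache_key("a.example", "/", "has_js=1") == cache_key("a.example", "/"))
```

The same URL on two domains produces two different keys, and visitors without a WordPress cookie all map to the same cached copy of a page.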
Ah i see, thanks!
Honestly I tried a cPanel plugin that adds Varnish support. I could not measure any performance difference on my server, and the plugin was riddled with bugs.
Your manual install method may be more efficient. Or maybe my server is powerful enough not to need any of these extras.
Peace,
Gene
Very nice post – great advice!
However, I have noticed warnings that Varnish is designed for 64-bit systems due to the availability of virtual memory.. I don’t know if it is entirely true – some people seem to have been able to tweak it to work on 32 bits.
But if it is correct, then perhaps you ought to mention in the intro to your post that this only works on 64 bit OS-es?
I had no idea about the 32bit vs 64bit issue but from a very brief Google search it only seems to be an issue on very large servers using more than 2GB of cache space. I’m only using 1GB of cache so it’s not an issue and I doubt it’ll be an issue for most blogs.
After reading this thread I checked varnishstat again:
Loads of free space!
I notice you have no ads on this site. What happens if you have Google and/or Amazon ads that want to send cookies? If every page has ads, does that mean no pages get cached?
Donncha,
Out of interest, has this had a measurable impact using:
* Firebug
* Google Page Speed
* Google Webmaster Tools
* Google Analytics Site Performance
* …
Al.
I didn’t try any of those but it probably only makes a small difference when there’s normal traffic (if you already cache WordPress pages that is). It’s when there’s an increase in traffic that Varnish would be able to handle serving those requests faster than Apache would.
Donncha, do you use WP Super Cache on your sites together with Varnish?
Yes, it works quite well. I don’t honestly need it on this server because my traffic has plummeted over the last year or so but it’s that one extra line of defence in case I get a surge of traffic.
You might enjoy this more verbose wordpress varnish VCL that includes:
Features:
Load balancing
Probing
Does not cache wp-admin
Puts all uploads/content requests onto one server
Purging
Long timeout for file uploads
XML RPC support
Custom 404 and 500 message
Forwards user IP address for comments
I’m wondering if you’ve gone for a corresponding decrease in the number of Apache servers that get started / held spare with this Varnish setup?
Is this plugin useful (or will it even work) if I’m using MaxCDN to serve up my static files? I’m using W3 Total Cache as my caching plugin.
If using Debian, just run “apt-get install libapache2-mod-rpaf”. This will look at the “X-Forwarded-For” headers and change the remote ip shown to other modules. It can probably be added to Centos as well if you compile it: http://stderr.net/apache/rpaf/
By external ip address do you mean public or private ip? cause i followed your steps on my centos 5 machine, when i check the http headers I don’t see varnish at all…
Hi,
Just a word of warning: having found this and used it as the basis of a VCL, I have found that it uses directives that don’t work since Varnish 3.0, notably beresp.cacheable.
It would be nice to have an updated one published.