The Importance of Caching WordPress

Nginx Configurations

The following are the server configurations used for each scenario.

W3 Total Cache (no file-based caching, just memcache)

server {
	listen 80;
 
	server_name kbeezie.com www.kbeezie.com;
	root html/kbeezie.com;
 
	access_log logs/kbeezie.access.log;
	error_log logs/kbeezie.error.log;
 
	# Simply using try_files, tests the request uri against a file, then folder
	# then if neither can be found, the request is sent to index.php
	# this is a lot simpler than the .htaccess method of rewriting permalinks
 
	location / { 
		try_files $uri $uri/ /index.php; 
	}
 
	# Normally you do not need this if you are not using any error_page directive
	# but having it off allows WordPress to return its own error page
	# rather than the plain Nginx screen
 
	fastcgi_intercept_errors off;
 
	# Caching the typical static files such as css, js, jpg, png and so forth
	# helps in telling the browser they can cache the content
	location ~* \.(ico|css|js|gif|jpe?g|png)$ {
		expires max;
		add_header Pragma public;
		add_header Cache-Control "public, must-revalidate, proxy-revalidate";
	}
 
	# I like to place my PHP stuff into its own file
	# see http://kbeezie.com/view/nginx/ for more information
	include php.conf;
 
	# We don't really need to log favicon requests
	location = /favicon.ico { access_log off; log_not_found off; }	
 
	# We don't want to allow the browsers to see .hidden linux/unix files
	location ~ /\. { deny  all; access_log off; log_not_found off; }
}
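
The php.conf file included above is not reproduced in this article. As a rough sketch only, assuming a PHP FastCGI backend listening on 127.0.0.1:9000 (an assumption, adjust it to your own setup) and the stock fastcgi_params file that ships with Nginx, it might look something like this:

location ~ \.php$ {
	# Return a 404 for PHP files that do not exist instead of passing them to the backend
	try_files $uri =404;

	# Standard FastCGI variables, plus the full path of the script to execute
	include fastcgi_params;
	fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

	# Assumed backend address; a unix socket is also common
	fastcgi_pass 127.0.0.1:9000;
}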

WP SuperCache (file-based caching)

This configuration differs from the gzip precompression scenario by a single line, “gzip_static on;”, which serves a pre-gzipped file if one exists with the same name as the request. For example, if the browser asks for index.html and there is an index.html.gz in the same folder, Nginx will serve that file to gzip-enabled browsers when the gzip static module has been enabled. Otherwise, if regular gzip compression is turned on (“gzip on;”), it will gzip the requested file on the fly.

server {
	listen 80;
 
	server_name kbeezie.com www.kbeezie.com;
	root html/kbeezie.com;
 
	access_log logs/kbeezie.access.log;
	error_log logs/kbeezie.error.log;
 
	location / { 
		# This line when enabled will use Nginx's gzip static module
		gzip_static on;
 
		# Disables serving gzip content to IE 6 or below
		gzip_disable        "MSIE [1-6]\.";
 
		# Sets the default type to text/html so that gzipped content is served
		# as html, instead of raw uninterpreted data.
		default_type text/html;
 
		# does the requested file exist exactly as it is? if yes, serve it and stop here
		if (-f $request_filename) { break; }
 
		# sets some variables to help test for the existence of a cached copy of the request
		set $supercache_file '';
		set $supercache_uri $request_uri;
 
		# IF the request is a post, has a query attached, or a cookie
		# then don't serve the cache (ie: users logged in, or posting comments)
		if ($request_method = POST) { set $supercache_uri ''; }
		if ($query_string) { set $supercache_uri ''; }
		if ($http_cookie ~* "comment_author_|wordpress|wp-postpass_" ) { 
			set $supercache_uri ''; 
		}
 
		# if the supercache_uri variable hasn't been blanked by this point, attempt
		# to set the name of the destination to the possible cache file
		if ($supercache_uri ~ ^(.+)$) { 
			set $supercache_file /wp-content/cache/supercache/$http_host/$1index.html; 
		}
 
		# If a cache file of that name exists, serve it directly
		if (-f $document_root$supercache_file) { rewrite ^ $supercache_file break; }
 
		# Otherwise send the request back to index.php for further processing
		if (!-e $request_filename) { rewrite . /index.php last; }
	}
 
	# Normally you do not need this if you are not using any error_page directive
	# but having it off allows WordPress to return its own error page
	# rather than the plain Nginx screen
 
	fastcgi_intercept_errors off;
 
	# Caching the typical static files such as css, js, jpg, png and so forth
	# helps in telling the browser they can cache the content
	location ~* \.(ico|css|js|gif|jpe?g|png)$ {
		expires max;
		add_header Pragma public;
		add_header Cache-Control "public, must-revalidate, proxy-revalidate";
	}
 
	# I like to place my PHP stuff into its own file
	# see http://kbeezie.com/view/nginx/ for more information
	include php.conf;
 
	# We don't really need to log favicon requests
	location = /favicon.ico { access_log off; log_not_found off; }	
 
	# We don't want to allow the browsers to see .hidden linux/unix files
	location ~ /\. { deny  all; access_log off; log_not_found off; }
}

No Caching

The configuration for the no-cache scenario is the same as the W3 Total Cache configuration above, since both leave PHP to handle the processing. The main difference is on the PHP side, where W3 Total Cache caches pages, minified output, database queries, and objects.

server {
	listen 80;
 
	server_name kbeezie.com www.kbeezie.com;
	root html/kbeezie.com;
 
	access_log logs/kbeezie.access.log;
	error_log logs/kbeezie.error.log;
 
	# Simply using try_files, tests the request uri against a file, then folder
	# then if neither can be found, the request is sent to index.php
	# this is a lot simpler than the .htaccess method of rewriting permalinks
 
	location / { 
		try_files $uri $uri/ /index.php; 
	}
 
	# Normally you do not need this if you are not using any error_page directive
	# but having it off allows WordPress to return its own error page
	# rather than the plain Nginx screen
 
	fastcgi_intercept_errors off;
 
	# Caching the typical static files such as css, js, jpg, png and so forth
	# helps in telling the browser they can cache the content
	location ~* \.(ico|css|js|gif|jpe?g|png)$ {
		expires max;
		add_header Pragma public;
		add_header Cache-Control "public, must-revalidate, proxy-revalidate";
	}
 
	# I like to place my PHP stuff into its own file
	# see http://kbeezie.com/view/nginx/ for more information
	include php.conf;
 
	# We don't really need to log favicon requests
	location = /favicon.ico { access_log off; log_not_found off; }	
 
	# We don't want to allow the browsers to see .hidden linux/unix files
	location ~ /\. { deny  all; access_log off; log_not_found off; }
}

Update: W3 Total Cache rewrite Rules for Disk (enhanced) caching
The following rules have been updated to work with W3TC 0.9.2.8 and include support for gzipped content. In some cases $host may need to be changed to the actual domain name.
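
As a sketch of that substitution, suppose W3TC wrote its cache under wp-content/cache/page_enhanced/kbeezie.com/ (an assumed directory name; check what actually exists on disk). This can matter, for example, when the site answers on both kbeezie.com and www.kbeezie.com but the cache lives under only one of them. The two places in the configuration below that use $host would then read:

		if (-f "$document_root/wp-content/cache/page_enhanced/kbeezie.com/$request_uri/_index$w3tc_ua$w3tc_ref$w3tc_ssl.html$w3tc_enc") {
		    set $w3tc_ext .html;
		}

		if ($w3tc_rewrite = 1) {
		    rewrite ^ "/wp-content/cache/page_enhanced/kbeezie.com/$request_uri/_index$w3tc_ua$w3tc_ref$w3tc_ssl$w3tc_ext$w3tc_enc" last;
		}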

server {
	listen 80;
 
	server_name kbeezie.com www.kbeezie.com;
 
	root /usr/local/www/kbeezie.com;
 
	access_log /var/log/nginx/kbeezie.access.log;
	error_log /var/log/nginx/kbeezie.error.log;
 
	location / { 
		if (-f $request_filename) {
		        break;
		}
 
		set $w3tc_rewrite 1;
		if ($request_method = POST) { set $w3tc_rewrite 0; }
		if ($query_string != "") { set $w3tc_rewrite 0; }
 
		set $w3tc_rewrite2 1;
		if ($request_uri !~ \/$) { set $w3tc_rewrite2 0; }
		if ($request_uri ~* "(sitemap(_index)?\.xml(\.gz)?|[a-z0-9_\-]+-sitemap([0-9]+)?\.xml(\.gz)?)") { set $w3tc_rewrite2 1; }
		if ($w3tc_rewrite2 != 1) { set $w3tc_rewrite 0; }
 
		if ($http_cookie ~* "(comment_author|wp\-postpass|wordpress_\[a\-f0\-9\]\+|wordpress_logged_in)") { set $w3tc_rewrite 0; }
		if ($http_user_agent ~* "(W3\ Total\ Cache/0\.9\.2\.4)") { set $w3tc_rewrite 0; }
 
		set $w3tc_ua "";
		set $w3tc_ref "";
		set $w3tc_ssl "";
		set $w3tc_enc "";
 
		if ($http_accept_encoding ~ gzip) { set $w3tc_enc _gzip; }
 
		set $w3tc_ext "";
		if (-f "$document_root/wp-content/cache/page_enhanced/$host/$request_uri/_index$w3tc_ua$w3tc_ref$w3tc_ssl.html$w3tc_enc") {
		    set $w3tc_ext .html;
		}
		if ($w3tc_ext = "") { set $w3tc_rewrite 0; }
 
		if ($w3tc_rewrite = 1) {
		    rewrite ^ "/wp-content/cache/page_enhanced/$host/$request_uri/_index$w3tc_ua$w3tc_ref$w3tc_ssl$w3tc_ext$w3tc_enc" last;
		}
 
		if (!-e $request_filename) { 
			rewrite ^ /index.php last; 
		}
	}
 
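	# Rate-limit searches. try_files is used rather than rewrite, because a
	# rewrite here takes effect before the request limit has a chance to act.
	location /search { limit_req zone=kbeezieone burst=3 nodelay; try_files $uri /index.php; }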
 
	fastcgi_intercept_errors off;
 
	location ~* \.(?:ico|css|js|gif|jpe?g|png)$ {
		expires max;
		add_header Pragma public;
		add_header Cache-Control "public, must-revalidate, proxy-revalidate";
	}
 
	# see http://kbeezie.com/view/nginx/ for more information
	include php.conf;
	location = /favicon.ico { access_log off; log_not_found off; }	
	location ~ /\. { deny  all; access_log off; log_not_found off; }
}
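
The location /search block in the configuration above refers to a request-limiting zone named kbeezieone, which has to be declared once in the http block before it can be used. A minimal sketch, assuming the same zone size and rate suggested in the comments below (10m and 1r/s; the values actually used on this site are not given):

http {
	# Track clients by IP address in a 10 MB shared zone, allowing 1 request per second each
	limit_req_zone $binary_remote_addr zone=kbeezieone:10m rate=1r/s;

	# ... server blocks go here ...
}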

5 comments

  1. kbeezie says:

    In the memcached configuration, the obvious difference is that PHP is accessed on every single request to the site, which adds the overhead of PHP checking the request, then retrieving the keyed item from memcached and serving it. In the file-based configuration (which both W3 Total Cache and WP Super Cache can be configured with), Nginx bypasses PHP altogether and serves static content directly from disk.

    As to why: first off, I don’t get that much traffic, and even if I did, the current configuration can still handle quite a bit. Also, with this configuration I don’t have to add any excessive rewrite rules to Nginx to check for a disk-based cache, and the cache is easily refreshed when I publish new content with the memcache setup. I used to use wp-super-cache with a preloaded cache almost exclusively, which was indeed fast under very high load (which I almost never get), but it only caches files, it doesn’t improve performance for logged-in users, and I have to make sure to clear the disk cache whenever I change the design or the site.

  2. mastafu says:

    Great article.

    However, I have the following problem.

    My WP setup is like this.

    Currently I am on shared hosting with WP + W3 Total Cache, and during peak hours my site is very slow. That is mainly because I get huge traffic from Google.

    My website caches plenty of keywords with AskApache Google 404 and Redirection.

    What happens is that traffic from Google goes to /search/what-ever-keywords, which is dynamically created every time. And that is killing my system.
    The problem is I have no idea how to help the poor server and cache that kind of traffic.

    Would you have any advice for that?
    Regards,
    Peter

  3. kbeezie says:

    That’s a rather good question, especially considering you can’t easily cache random searches. I was looking into it, and it also seems to be a common way of overloading a WordPress site.

    The Nginx webserver does provide one feature that may help, called the Limit Request Module (http://wiki.nginx.org/HttpLimitReqModule)

    Essentially you could have a location block like so (the first line goes somewhere in the http block):

    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    location /search { limit_req zone=one burst=5; rewrite ^ /index.php; }

    Essentially what happens is that the location /search is limited to a rate of 1 request per second based on a visitor’s IP address. A burst of 5 means that up to 5 requests exceeding this rate are queued and delayed; anything beyond that is rejected with a 503 error response. Google, for example, sees 503 as a kind of de facto “back off” response.

    The rewrite is there since on WordPress there shouldn’t ever be an actual folder named search, and all search requests are going to /index.php anyway.

  4. mastafu says:

    kbeezie, thank you for your reply.
    I think that this is not the issue here. What happens right now is that between 7pm and 9pm I am being heavily hit by Google … like 20-50 req/s
    So I would probably need 8 cores or something … which is super expensive … plus only required for part of the day.

    I need to look into your limiting module. What would be great is if someone who searches too much were redirected to the main page, or to a specified page where he would see a warning, and not a 503 error. That is a bit too drastic, I think.

    What do you think, is that possible?

  5. kbeezie says:

    By the way, I learned that the rewrite line will actually act before the limiting has a chance to act. So you have to use try_files $uri /index.php; instead, which allows the limiting module to act before trying for files.
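
    Applied to the earlier example, that would look roughly like this (same zone name and rate as before, with the first line still going in the http block):

    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    location /search { limit_req zone=one burst=5; try_files $uri /index.php; }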

    As far as Google goes, 503 is the de facto standard for “back off” to Google’s servers. You can, however, create a Google webmaster account ( https://www.google.com/webmasters/tools/ ), add your site, verify it, then set your crawl rate manually rather than letting Google do so automatically. This gives you some control in preventing Google from crawling your site too quickly.

    More hardware isn’t always the key to improving your site. As for an 8-core server, even a 4-core (or 4-core with hyper-threading) would be fine and not all that expensive unless you go with prices from places like Rackspace (though expensive if you’re only used to VPS pricing).