
Caching in Drupal

Chris Johnson, VP of Engineering

Caching is commonly used to boost the performance of Drupal-based websites, and there are many modules and options for fine-grained control of when items expire. However, the standard caching settings available under Performance are frequently misinterpreted, and different caching backends have subtly different behavior. Here is a breakdown of what the settings mean for page caching and how the two most common storage engines behave for particular events.

Cache Pages for Anonymous Users

The first caching setting under Performance is “Cache pages for anonymous users”. It must be enabled for the other options below to take effect. It tells Drupal it’s OK to store the rendered version of a page generated for one unauthenticated user and serve it to another. Pages are stored with a cache ID of the URL of the request, technically $base_root . request_uri(), so requests for http://example.com/page_one and http://example.com/page_one?rand=1 are considered two different requests and are generated separately, even if the query string has no effect on the rendering of the page.
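To make the cache ID behavior concrete, here is a minimal Python sketch that mimics the $base_root . request_uri() concatenation. The function name page_cache_id is hypothetical and only illustrates the string comparison; it is not Drupal code.

```python
def page_cache_id(base_root: str, request_uri: str) -> str:
    """Build a page cache ID the way the article describes:
    the base root followed by the full request URI, query string included."""
    return base_root + request_uri


base_root = "http://example.com"
cid_plain = page_cache_id(base_root, "/page_one")
cid_query = page_cache_id(base_root, "/page_one?rand=1")

# The IDs differ, so each URL gets its own cache entry
# and its own page generation.
print(cid_plain)
print(cid_query)
print(cid_plain == cid_query)
```

Because the query string is part of the ID, any URL parameter that varies per request (a random token, a tracking parameter) produces a separate cache entry for the same rendered page.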

Once page caching is enabled there are two settings to consider: Minimum cache lifetime and expiration of cached pages.

Minimum cache lifetime is often misinterpreted as meaning “pages will be regenerated after this much time has passed”. What it actually means is that pages will not be regenerated until at least this much time has passed and a cache clearing event has happened. I’ll discuss cache clearing events after covering expiration of cached pages.

Expiration of cached pages is also sometimes misinterpreted. This value controls what is sent as the max-age value in the Cache-Control header, and thus advises proxy servers how long they may serve the page without asking your Drupal install for a new copy. It does not mean that the page will be regenerated after this much time; it only means that the proxy server must check back with Drupal after this much time to see whether a new version of the page exists. Drupal will only regenerate a page after a cache clearing event occurs.
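As a rough illustration of what the expiration setting does and does not control, the sketch below builds a Cache-Control value and applies the freshness check a proxy would make. Both function names are hypothetical, and the "public" directive is an assumption about the exact header Drupal emits. Only the max-age part comes from the description above.

```python
def cache_control_header(expiration_seconds: int) -> str:
    """Format the header value; 'public' is an assumed directive."""
    return f"public, max-age={expiration_seconds}"


def proxy_may_serve_stored_copy(copy_age_seconds: int, max_age: int) -> bool:
    """A proxy may reuse its stored copy only while the copy is
    younger than max-age; after that it must check back with Drupal."""
    return copy_age_seconds < max_age


header = cache_control_header(300)
print(header)

# Within max-age the proxy answers on its own.
print(proxy_may_serve_stored_copy(120, 300))
# Past max-age the proxy asks Drupal, which may still answer from its
# own page cache if no cache clearing event has occurred.
print(proxy_may_serve_stored_copy(301, 300))
```

Note that neither function says anything about regeneration: the proxy checking back with Drupal and Drupal rebuilding the page are two separate decisions.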

Cache Clearing Event

I’ve mentioned cache clearing events several times at this point, and they are the most important thing to understand when dealing with Drupal’s page caching. Drupal will only regenerate a page when it has some reason to suspect that the result of regenerating the page will differ from the previous result. The things that make Drupal think the result may differ are what I call cache clearing events, primarily because they cause the cache_clear_all function to be called. What counts as a cache clearing event varies with the cache storage engine being used, as does how the event interacts with the minimum cache lifetime.

The two most common cache storage engines are the database and Memcached (DrupalDatabaseCache class and MemCacheDrupal class). Below is a chart showing their behavior during typical cache clearing events.

| Cache | Cron | Cache Clear All | Content Editing |
| --- | --- | --- | --- |
| Database | Page cache will clear in minimum cache lifetime. | Page cache clears immediately and will clear again in minimum cache lifetime. | Page cache will clear in minimum cache lifetime. |
| Memcached | Not considered a cache clearing event. | Page cache considers all items generated prior to the current time minus minimum cache lifetime expired (thus clearing them). Items generated after that are not considered for expiration until the next cache clearing event. | Page cache considers all items generated prior to the current time minus minimum cache lifetime expired (thus clearing them). Items generated after that are not considered for expiration until the next cache clearing event. |
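The Memcached row of the chart can be sketched as a small toy model. This is not the MemCacheDrupal implementation; the class and method names are hypothetical, and time is passed in explicitly as seconds so the behavior is easy to follow. It only encodes the rule stated above: a clearing event expires items generated before the current time minus the minimum cache lifetime, and cron does nothing.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MemcachedStylePageCache:
    """Toy model of the Memcached row; not real Drupal code."""
    min_lifetime: int  # minimum cache lifetime, in seconds
    created_at: Dict[str, int] = field(default_factory=dict)

    def store(self, cid: str, now: int) -> None:
        """Record when the page for this cache ID was generated."""
        self.created_at[cid] = now

    def cron(self, now: int) -> None:
        """Cron is not a cache clearing event for this backend."""

    def clearing_event(self, now: int) -> List[str]:
        """Expire every item generated before (now - min_lifetime).
        Newer items survive until the next clearing event."""
        cutoff = now - self.min_lifetime
        expired = sorted(
            cid for cid, created in self.created_at.items() if created < cutoff
        )
        for cid in expired:
            del self.created_at[cid]
        return expired


cache = MemcachedStylePageCache(min_lifetime=300)
cache.store("http://example.com/old", now=0)
cache.store("http://example.com/new", now=900)

cache.cron(now=1000)  # nothing is expired by cron
expired = cache.clearing_event(now=1000)  # cutoff is 1000 - 300 = 700

print(expired)                   # ['http://example.com/old']
print(sorted(cache.created_at))  # ['http://example.com/new']
```

In this model, the page generated at 900 stays cached even though a clearing event just happened, and it will stay cached indefinitely unless another clearing event arrives after it is older than the minimum cache lifetime.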

 

The treatment of a cron run as a cache clearing event when using the database for cache storage, combined with frequent cron runs, is what typically leads users familiar with Drupal to think of minimum cache lifetime as “my pages will be regenerated after this much time has passed”. It is also a source of frustration for users who want pages to remain cached until content is actually edited. Understanding the caching behavior of Drupal, its interactions with proxy servers like Varnish, and how your cache storage engine reacts in various situations is critical to ensuring your content is regenerated when necessary and properly cached when not.
