Drupal Is Not Your Web Site (A Primer)

Tobby Hagler, Director of Engineering
#Drupal | Posted

Next weekend, I'll be at NYC Camp delivering a session entitled "Drupal is not your website: Develop for high-scale, fragmented sites." It will be a beginner's look at developing beyond the CMS for a more complete, complex Web site that uses fragments for performance gains as well as real-time updates.

When building a small or simple Web site with Drupal, most of the time the end user interacts with Drupal directly. We as Drupal developers start off with these sorts of projects, where the application we develop (replete with custom modules, a theme full of templates, and content we've migrated) is the end result that every Web user sees.

Over time our projects grow in size and complexity, and our skillsets along with them. Our Drupal instance ceases to be the sole recipient of our technical labours, and users must interact with the CMS in different ways. This is when we are reminded that the CMS we are building is not actually the "thing" that Web users are directly looking at; Drupal is not our Web site.

The real trick is developing a Drupal instance that doesn't contain all the elements of your Web site while making all the parts and fragments interact as if they were all in the same CMS.

Performance and Scale

As traffic to our site grows, we begin to scale our infrastructure. We're already caching (page cache, Memcache, etc.) but that's still not enough; so we place a reverse-proxy server in front to handle the unauthenticated traffic. This allows us to render pages one time and deliver that cached version to many users. Thus, load on the Web server is reduced, and our Drupal instance stops crying for a little longer.
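To make the reverse-proxy idea concrete, here is a minimal sketch in Python of the decision such a proxy makes. It is not how any particular proxy is implemented. The class name, the `render` callback standing in for the Drupal backend, and the rule that a cookie name beginning with `SESS` or `SSESS` marks a logged-in user are all assumptions made for illustration.

```python
from typing import Callable, Dict


class AnonymousPageCache:
    """Toy reverse-proxy cache: anonymous requests share one rendered copy."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self.render = render              # hypothetical stand-in for the Drupal backend
        self.store: Dict[str, str] = {}   # path -> rendered HTML
        self.backend_hits = 0             # how often the backend had to render

    def is_authenticated(self, cookies: Dict[str, str]) -> bool:
        # Assumption: a session cookie named SESS... or SSESS... means "logged in".
        return any(name.startswith(("SESS", "SSESS")) for name in cookies)

    def handle(self, path: str, cookies: Dict[str, str]) -> str:
        if self.is_authenticated(cookies):
            # Authenticated traffic bypasses the cache and always hits the backend.
            self.backend_hits += 1
            return self.render(path)
        if path not in self.store:
            # First anonymous request for this path: render once and keep the copy.
            self.backend_hits += 1
            self.store[path] = self.render(path)
        return self.store[path]
```

With this sketch, two anonymous requests for `/news` cost one backend render, while a request carrying a session cookie is always passed through.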

But even that may not be enough. Eventually your projects move to load-balanced environments and you face the complexities of file management spread across multiple servers and file systems. Then you'll front-end your CMS with a content delivery network (CDN) and offload many of the ancillary assets to external systems. With a good CDN in place, you begin to peel off fragments of your pages, host them elsewhere, and let edge-side includes (ESI) assemble the page at the edge.

The more your site is splintered across multiple servers, the more you have to consider how you process content and files. Running batch processes means that some batch requests will not be handled by the same Web server that began the process (which may or may not have access to all the same temporary files or content).

Blurring the Lines Between Static and Dynamic Pages

One of the many things Drupal provides is dynamically generated pages. Your content changes over time and you want the pages of your Web site to reflect these changes quickly. Dynamic page generation comes at a cost, and since your content isn't constantly changing, you'll want to cache many of your pages. This means a cached page can become noticeably stale; your fresh content is in the database waiting to be seen, but no one can see it until the cache is cleared or invalidated.

What about content that needs to change in real time? User comments provide a new level of customer engagement and are generated much more frequently than the rest of the page's content.

There are ways to serve cached content to authenticated Drupal users. Using ESI, you can serve cached content containing edge-side include tags, which are parsed by the CDN. These tags can reference un-themed HTML fragments that your CMS generates. The "Welcome Username" in the header of your page need not be part of your actual Drupal theme directly. In fact, the bulk of your page's content assumes the user is unauthenticated while the user still receives a customized experience.
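The assembly step can be sketched in a few lines of Python. This is only an illustration of what the edge does with an `<esi:include src="..."/>` tag, not a CDN's actual implementation. The `assemble` function, the `fetch_fragment` callback, and the `/fragments/welcome` path are hypothetical names, and the regular expression handles only the simple self-closing form of the tag.

```python
import re
from typing import Callable

# Matches the simple self-closing form: <esi:include src="..."/>
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(page: str, fetch_fragment: Callable[[str], str]) -> str:
    """Replace each ESI include tag with the fragment fetched for its src."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), page)


# The cached page is identical for every user; only the fragment differs.
cached_page = (
    '<header><esi:include src="/fragments/welcome"/></header>'
    "<main>Cached article body</main>"
)

# Hypothetical per-user fragment, as the CMS might return it un-themed.
fragments = {"/fragments/welcome": "Welcome, Username"}

assembled = assemble(cached_page, fragments.__getitem__)
```

The page body is rendered and cached once, and only the small fragment has to be generated per user.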

Web sockets are another means of achieving real-time content updates while still caching the rest of your site. Live scoring or event updates need to reflect the actual progress of the game, so offload the updates to your users' browsers. In this scenario, you serve a statically cached page with a placeholder for scores (or any other rapidly changing content). The user's browser opens a web socket to an API somewhere (outside of Drupal), and updates are pushed directly to the browser and applied to the page without the need for refreshing or time-based polling.
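The placeholder-plus-push pattern can be sketched without a real socket. In the Python sketch below, an in-process `queue.Queue` stands in for the web socket connection, and the `ScoreBoard` class stands in for the browser. Both are simplifications, and every name here (`STATIC_PAGE`, `ScoreBoard`, `pump`, the `score` message field) is hypothetical.

```python
import queue
from typing import Dict

# The cached page never changes; only the placeholder value is patched.
STATIC_PAGE = '<h1>Game night</h1><div id="score">{score}</div>'


class ScoreBoard:
    """Stands in for the browser: holds the page and patches the placeholder."""

    def __init__(self) -> None:
        self.score = "0-0"  # value shown until the first update arrives

    def on_message(self, message: Dict[str, str]) -> None:
        # Called for each pushed update; no page reload or polling involved.
        self.score = message["score"]

    def render(self) -> str:
        return STATIC_PAGE.format(score=self.score)


def pump(channel: "queue.Queue[Dict[str, str]]", board: ScoreBoard) -> None:
    """Deliver every update currently waiting on the channel to the board."""
    while not channel.empty():
        board.on_message(channel.get_nowait())


# The scoring API (outside Drupal) pushes updates; the page picks them up.
channel: "queue.Queue[Dict[str, str]]" = queue.Queue()
board = ScoreBoard()
channel.put({"score": "1-0"})
channel.put({"score": "2-0"})
pump(channel, board)
```

In a real deployment the queue would be replaced by a web socket connection and `on_message` by a browser-side handler, but the division of labour is the same: the page comes from cache, the scores come from the push channel.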

Helpful Drupal modules:

Hope to see you at NYC Camp!


