Creating Options in Automated Software Deployment

Adam Ross, Software Architect
#Devops | Posted

Automated deployment is the practice of handling software change via predefined and repeatable processes that humans start but do not directly manage. This post helps you take your deployment system to the next level by considering a simple, alternate deployment process when speed matters.

The argument for automated deployment is a compelling one: fewer human errors, less downtime because a machine works faster than a person, and a lower risk of bugs thanks to a consistent and predictable process, all for a smaller resource investment in shipping code. For all these reasons, Phase2 makes a common practice of automating many of these operations and looks for opportunities to empower our clients with effective tools to keep their web systems running smoothly.

Temptations always arise to go around the automated system, usually for reasons of speed or flexibility. It is often easily justified as a “one-time need” to do something unique for the deployment, or to get out a fix for a bug just a little faster. Going around your automated process is a risky decision, and one to be avoided lest you suffer one of these common fates:

  • In emergency-fix situations the stress of getting a solution in place can increase the probability of forgetting a crucial step.

  • Some steps might be carried out incorrectly because of human error.

  • Differences between the operating infrastructure of an automated process vs. the person managing deployment, or differences in file access permissions, can result in unexpected issues.

  • One element of the infrastructure might be missed, such as a second webserver.

  • Rollback becomes slow and painful, because the automated deployment rollback is not available either.

There is always a role for people in controlling and monitoring deployments, and in performing special out-of-band operations and fixes. However, it might be worth adding a little complexity* to your automation software to match the speed goal for those occasions when a laborious automated process is simply too heavy. The cleanest dividing line is the difference between a deployment of code, and a deployment of database change.

* Keep it Simple! You are not in the business of building complex automation software!

The Full Deployment

In a full deployment of many web applications, especially Drupal, there are a number of steps to be taken. Here are a few generic operations you will see in most Features-based Drupal sites and their impact on the production environment:

  • Backup the Database → Protection from Data Change Errors
  • Upload new code → Code Change
  • Turn on Maintenance Mode → Low-risk Data Change
  • Run update operation → Data Change
  • Import Configuration from Code to Database → Data Change
  • Clear the Caches → No-risk Data Change
  • Turn off Maintenance Mode → Low-risk Data Change

On a large Drupal site, a full deployment could take anywhere from ten minutes to an hour, depending on how long it’s been between deployments and how much content there is.

What if I said you don’t always need all those expensive, long-running steps?

The Code-only Deployment

There are times when you don’t technically need to run every single operation as part of a deployment. Sure, the automated system might enforce it, but the sorts of changes you are deploying may have no database impact. In those cases, most of the steps we are running are unnecessary. Create a simple, manual toggle in your deployment system by which a member of the technical staff can select which type of deployment to run, and those operations can be cut down:

  • Backup the Database → Protection from Data Change Errors [skipped]
  • Upload new code → Code Change
  • Turn on Maintenance Mode → Low-risk Data Change [skipped]
  • Run update operation → Data Change [skipped]
  • Import Configuration from Code to Database → Data Change [skipped]
  • Clear the Caches → No-risk Data Change
  • Turn off Maintenance Mode → Low-risk Data Change [skipped]

We are down to uploading the new code and clearing the caches, which makes sure the content cached for the previous codebase does not collide with the new code. This can yield an order-of-magnitude improvement in your deployment time.
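One way to picture the toggle is as a flag on each step. The following is a minimal Ruby sketch, not code from any real deployment tool: the `Step` struct, the `code_only` flag, and the `steps_for` method are hypothetical names invented for illustration.

```ruby
# Hypothetical model of the deployment steps listed above. The struct, the
# flag, and the method name are illustrative assumptions, not a real API.
Step = Struct.new(:name, :impact, :code_only)

STEPS = [
  Step.new("Backup the Database", "Protection from Data Change Errors", false),
  Step.new("Upload new code", "Code Change", true),
  Step.new("Turn on Maintenance Mode", "Low-risk Data Change", false),
  Step.new("Run update operation", "Data Change", false),
  Step.new("Import Configuration from Code to Database", "Data Change", false),
  Step.new("Clear the Caches", "No-risk Data Change", true),
  Step.new("Turn off Maintenance Mode", "Low-risk Data Change", false)
].freeze

# Returns the steps to run for the chosen deployment type.
def steps_for(type)
  case type
  when :full then STEPS
  when :code_only then STEPS.select(&:code_only)
  else raise ArgumentError, "unknown deployment type: #{type.inspect}"
  end
end

puts steps_for(:code_only).map(&:name)
```

Keeping both deployment types in one list means a new step has to be classified once, rather than remembered in two separate scripts.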

When Does Code-only Make Sense?

There are 2½ conditions that must be met in order to safely run a code-only deployment.

  • There are no database updates bundled with any module updates.

  • There are no configuration changes to be imported from the code, such as you would need if managing your site with the Features module when any specific Feature has been changed.

  • (Now for the half.) None of the developers has a “special procedure” or “extra script” they want to run right after deployment. (You’ve disabled the database backup, remember!)

Given that your release manager will know if a developer is asking permission to run a script, that leaves us with the first two areas to explore.
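The conditions above boil down to a simple rule: any one of them forces a full deployment. Here is a rough Ruby sketch of that rule, where the method and its keyword arguments are hypothetical names chosen for this post.

```ruby
# Hypothetical helper: all three keyword names are assumptions made for
# illustration. Any "yes" answer means the full deployment is required.
def deployment_type(database_updates:, config_changes:, extra_scripts:)
  return :full if database_updates || config_changes || extra_scripts

  :code_only
end

puts deployment_type(database_updates: false, config_changes: false, extra_scripts: false)
puts deployment_type(database_updates: false, config_changes: true, extra_scripts: false)
```

The answers still have to come from people who know the release, so treat this as a way to record the decision rather than to make it.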

When Is a Database Update Needed?

Maybe you read all the code for new versions of Drupal and the modules in the system. That is a very good but technically heavy process to follow. On the other hand, if you are relying on the community to have supplied you with good software, and just need to know in a hurry whether an update operation is needed, simply try running one against a copy of the release candidate code and the latest developer-sanitized snapshot of the production database.

If you begin running update.php or drush updatedb and no updates are needed, the system will tell you so. You can now proceed knowing that none of the modules want to move any data or database schemas around in order to function properly.
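If you want a script to read that answer for you, a sketch along these lines could work. It assumes you have already captured the text printed by the update command on your test copy, and the message it looks for is an assumption: the exact wording differs between Drush versions, so check what yours prints before relying on it.

```ruby
# ASSUMPTION: this is the text printed when nothing is pending. Confirm the
# exact wording for your Drush version and adjust the constant to match.
NO_UPDATES_MESSAGE = "No database updates required"

# Hypothetical helper. Anything other than the expected "nothing to do"
# message counts as "update needed", so unrecognized or empty output fails
# safe toward a full deployment.
def database_update_needed?(command_output)
  !command_output.include?(NO_UPDATES_MESSAGE)
end

puts database_update_needed?("No database updates required")
puts database_update_needed?("")
```

The helper deliberately errs on the side of the slower full deployment whenever it does not recognize the output.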

When Does Configuration Need Import?

This question is also known by its more jargony alias, “When do I need to run Features Revert on my configuration modules?” It is often asked and always difficult to answer. The current tools in Drupal simply do not provide an easy and clean way to declare when something is no longer managed by the UI, and instead has been deliberately set by developers and site builders to a specific setting enforced by code in a module. It is up to individual administrators to build a solid understanding of the system they work with day to day.

If your development team is managing a lot of configuration for your system, there are a few rules of thumb that emerge from best practices in Drupal that might help you learn the site.

  • Content types, fields, vocabularies (but not taxonomy terms), contexts, and views are very likely in code. Any changes made to these will need to be exported as part of the codebase, and any time the code has been changed with such “exports”, your next deployment will require a database-changing configuration import. Anything new you add to the site in these categories will not be affected by a configuration import, but it is a bad idea to get into a place of “mixed management”.

  • Anything under admin/config (but not path aliases or redirects) is likely in code as well, unless the team has specified that it is available for use through the UI.

If a given deployment has any functionality tied to these or similar “managed components”, you should start planning on a full deployment.
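As a first-pass check, you could scan the list of files changed in a release for names that look like exported configuration. The sketch below is only a heuristic: the filename patterns are assumptions based on common Drupal 7 Features exports, and the method name is hypothetical. A clean result does not prove that a code-only deployment is safe.

```ruby
# ASSUMPTION: these patterns reflect typical Drupal 7 Features export files.
# Adjust them to match how your own codebase stores configuration.
MANAGED_CONFIG_PATTERNS = [
  /\.features(\.[a-z0-9_]+)?\.inc\z/, # e.g. blog.features.field_instance.inc
  /\.views_default\.inc\z/,
  /\.context\.inc\z/,
  /\.strongarm\.inc\z/
].freeze

# Hypothetical helper: true when any changed path looks like exported config.
def config_import_needed?(changed_files)
  changed_files.any? do |path|
    MANAGED_CONFIG_PATTERNS.any? { |pattern| path.match?(pattern) }
  end
end

puts config_import_needed?(["sites/all/modules/custom/blog/blog.features.field_instance.inc"])
puts config_import_needed?(["sites/all/themes/site/css/style.css"])
```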

The Best Way to Know if Code-only Would Work

Get everyone with access to make changes to the site’s codebase or staging database together in a meeting. Ask everyone.

Suppose You Want to Try This with Capistrano...

There are many great tools you can use for automated deployment: ant, phing, make, grunt, bash, aegir, and of course, just about any of those wrapped in Jenkins or another job runner. Most recently I needed to set this up with Capistrano 2, which is an excellent Ruby-based system you can use with Drupal. Metal Toad has an excellent write-up on Capistrano 2 that is still relevant years later.

After some consideration, I leveraged Cap’s multi-environment system, and created special code-only versions of the normal deployment environments with slightly different server settings. Making this work was tricky, and required some careful googling to find clues such as what to do if the Capistrano script does not see any servers available for an operation.

Here is the notion.

Define Your Environments

Note that both environments use the same webserver, but have different attributes.

The Full Deployment Environment

role :web, "web.example.com", { :fulldeploy => true }

The Code-only Deployment Environment

role :web, "web.example.com", { :fulldeploy => false }

Make Your Tasks Adaptive

Here’s an example of a Capistrano task that will run the database update operation if the webserver is configured as “fulldeploy”.

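  # Runs only on servers whose role is marked :fulldeploy => true.
  # If no server matches, the task is skipped instead of aborting the deploy.
  desc "run drupal update"
  task :updb, :roles => :web, :only => { :fulldeploy => true }, :on_no_matching_servers => :continue do
    run "drush -r #{current_path} -y updatedb"
  end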
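To see why the `:only` filter and `:on_no_matching_servers => :continue` matter together, here is a simplified model in plain Ruby. It is not Capistrano's implementation, and `Server`, `run_task`, and `NoMatchingServersError` are hypothetical names defined only for this sketch. It illustrates the behavior described above: a task runs on servers whose attributes match, and with no match it either skips quietly or stops with an error.

```ruby
# Simplified, hypothetical model of role attribute filtering. This is not
# Capistrano code; it only illustrates the behavior.
Server = Struct.new(:host, :attributes)

class NoMatchingServersError < StandardError; end

# Yields each server whose attributes include every pair in `only`.
# With no match, returns [] when told to :continue, otherwise raises.
def run_task(servers, only: {}, on_no_matching_servers: :abort)
  matching = servers.select do |server|
    only.all? { |key, value| server.attributes[key] == value }
  end

  if matching.empty?
    return [] if on_no_matching_servers == :continue

    raise NoMatchingServersError, "no servers matched #{only.inspect}"
  end

  matching.map { |server| yield server }
end

full      = [Server.new("web.example.com", { :fulldeploy => true })]
code_only = [Server.new("web.example.com", { :fulldeploy => false })]
updb      = ->(server) { "updatedb on #{server.host}" }

p run_task(full, only: { :fulldeploy => true }, on_no_matching_servers: :continue, &updb)
# prints ["updatedb on web.example.com"]
p run_task(code_only, only: { :fulldeploy => true }, on_no_matching_servers: :continue, &updb)
# prints []
```

In the code-only environment the update task finds no matching server and does nothing, which is exactly what lets one task definition serve both deployment types.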

 

 
