The Art of SSH Key Management with Puppet

Phase2 | Digital Agency

December 22, 2014

DevOps

In August of 2014, the National Institute of Standards and Technology published a paper covering the security concerns organizations should understand when using SSH in their environments, particularly when used as part of an automation strategy.

Titled “Security of Automated Access Management Using Secure Shell (SSH)” (NISTIR 7966), its arrival generated a good deal of attention in security- and IT-focused media, but not as much among those of us practicing continuous integration and DevOps. Perhaps it should have.

It’s no secret that many of our toolsets, workflows, and hosting providers are often dependent on SSH and SSH keys to manage access to systems, repositories, and automation frameworks. However, in many of the environments that I’ve worked with, effective management of SSH keys either isn’t considered, or isn’t well understood. In the environments where it was understood, the people well positioned to follow the best practices suggested in NISTIR 7966 were often not empowered to take positive action.

The guidelines laid out in NISTIR 7966 are great, and the details of their recommendations are definitely worth the read. Within the context of continuous deployment, though, I believe they can be summarized briefly: know what your automation tools are empowered to do with the credentials you provide them, and establish a flexible and complete strategy for managing those credentials over the long term.

Most SSH implementations offer several ways to authenticate. By far the most popular, though, is likely SSH public keys (and/or certificates, which, for our purposes here, we’ll treat as the same). If you’re using a deploy key on your Bitbucket/GitHub repository, or automating any number of tasks across your infrastructure via SSH-enabled Jenkins/Buildbot jobs, you’re already in the target audience for NIST’s advice.

Mapped onto our two simplified goals, with a focus on SSH keys, NIST’s recommendations might look like this:

Knowing

Have an up-to-date inventory of all enabled SSH keys in your organization
Understand the ways deployed keys enable access across your infrastructure
Establish clear distinctions between systems with differing security needs
Have clear policies in place that limit key use to only the minimum privileges necessary
Monitor changes to your deployed keys, and to authorized keys files
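As a rough illustration of the inventory item above, here is a minimal Python sketch that turns the lines of an authorized_keys file into records you could collect centrally. The names `KeyEntry`, `parse_authorized_key`, and `inventory` are hypothetical, the list of key types is an assumption you would extend for your own environment, and the regular expression is a simplification of the real file format rather than a complete parser. The fingerprint follows the convention of hashing the decoded key blob with SHA-256 and base64-encoding the digest without padding.

```python
import base64
import hashlib
import re
from typing import Iterable, List, NamedTuple, Optional

# Key types this sketch recognises (an assumption; extend to suit your environment).
KEY_TYPES = (
    "ssh-rsa", "ssh-dss", "ssh-ed25519",
    "ecdsa-sha2-nistp256", "ecdsa-sha2-nistp384", "ecdsa-sha2-nistp521",
)

# authorized_keys layout: [options] key-type base64-key [comment]
_ENTRY = re.compile(
    r"^(?:(?P<options>.*?)\s+)?"
    r"(?P<key_type>" + "|".join(re.escape(t) for t in KEY_TYPES) + r")\s+"
    r"(?P<blob>[A-Za-z0-9+/=]+)"
    r"(?:\s+(?P<comment>.*))?$"
)


class KeyEntry(NamedTuple):
    options: str
    key_type: str
    fingerprint: str
    comment: str


def parse_authorized_key(line: str) -> Optional[KeyEntry]:
    """Parse one line; return None for blank lines, comments, or unparseable input."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = _ENTRY.match(line)
    if match is None:
        return None
    try:
        blob = base64.b64decode(match.group("blob"), validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    digest = hashlib.sha256(blob).digest()
    fingerprint = "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
    return KeyEntry(
        options=match.group("options") or "",
        key_type=match.group("key_type"),
        fingerprint=fingerprint,
        comment=match.group("comment") or "",
    )


def inventory(lines: Iterable[str]) -> List[KeyEntry]:
    """Collect every parseable key entry from an iterable of lines."""
    return [entry for entry in map(parse_authorized_key, lines) if entry is not None]
```

Running `inventory()` over each account's authorized_keys file and storing the results keyed by host and user gives you something to diff over time, which also helps with the monitoring item in the list.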

Managing

Enforce strong SSH client and server configurations
Enforce regular key rotation
Have policies in place for the life cycle of your deployed keys, with provisions for prompt deactivation of keys which are no longer used, or which have been compromised
Establish limits for where keys can be used, and what they can be used for
Automate key provisioning processes to enforce consistent deployment according to policy
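One way to limit where a key can be used, and what it can do, is through the options field of an authorized_keys entry. The sketch below builds such an entry in Python. The `from=`, `command=`, and `no-*` options are part of the OpenSSH authorized_keys format; the function name `restricted_entry`, its parameters, and the key material in the usage note are hypothetical placeholders, and a real deployment would more likely generate these lines through your configuration management tool.

```python
from typing import Optional, Sequence

# Restriction flags from the OpenSSH authorized_keys format.
NO_FORWARDING_FLAGS = (
    "no-agent-forwarding",
    "no-port-forwarding",
    "no-pty",
    "no-X11-forwarding",
)


def _quote(value: str) -> str:
    """Wrap a value in double quotes, backslash-escaping any embedded quotes."""
    return '"' + value.replace('"', '\\"') + '"'


def restricted_entry(
    key_type: str,
    key_blob: str,
    comment: str,
    allowed_hosts: Sequence[str] = (),
    forced_command: Optional[str] = None,
) -> str:
    """Return an authorized_keys line limited by source host and forced command."""
    options = []
    if allowed_hosts:
        # Only accept connections from these addresses or host patterns.
        options.append("from=" + _quote(",".join(allowed_hosts)))
    if forced_command is not None:
        # Run this command regardless of what the client asks for.
        options.append("command=" + _quote(forced_command))
    options.extend(NO_FORWARDING_FLAGS)
    return " ".join([",".join(options), key_type, key_blob, comment])
```

For example, `restricted_entry("ssh-ed25519", "AAAAexamplekeydata", "jenkins@ci", allowed_hosts=["10.0.0.5"], forced_command="/usr/local/bin/deploy")` produces a line beginning `from="10.0.0.5",command="/usr/local/bin/deploy",no-agent-forwarding,...`, so the deploy key works only from the CI host and only for the one command.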

If you have the good fortune to have an infrastructure designed from the ground up, purpose-built to strict requirements, and deployed with careful attention to consistency and with best-in-class automation tools, the ‘knowing’ part of the recommendations may already be done. If your infrastructure looks more like everyone else’s, though, spending time now reviewing it, and how best to use keys to secure it, will pay dividends.

Start the process by capturing the roles your systems play in your infrastructure. Webservers, database servers, etc. are typical roles, but you should not hesitate to use unique descriptions appropriate to your own infrastructure.

Do your automated deployment systems use keys that grant access across many (maybe all) of the system roles in your environment? Is that kind of access really necessary?

Work through the details of the systems to which you’ve assigned roles, and add as much detail to each role as possible. What data is handled by each role? What users and/or automated processes may need access to systems performing that role? If you were to limit access to a given role, what might break (if anything)?

Use the answers to these questions to build a picture of the ways you can, and should, divide access to your infrastructure. You’ll also be able to identify the different security needs of your internal systems. Build policies around those needs, and make certain those policies are clear, unambiguous, and consistent. Whenever possible, use automation tools, such as Puppet, to assist in the consistent application of those policies.
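Once roles are written down, the question of which keys reach across many of them can be answered mechanically. The following Python sketch assumes you have recorded, for each role, the set of key names allowed on systems in that role. The role names, key names, and both function names are hypothetical, and the threshold of one role is an arbitrary starting point rather than a recommendation from NISTIR 7966.

```python
from collections import defaultdict
from typing import Dict, List, Set


def roles_per_key(role_access: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Invert a role -> key names mapping into key name -> sorted list of roles."""
    reach = defaultdict(list)
    for role in sorted(role_access):
        for key_name in role_access[role]:
            reach[key_name].append(role)
    return dict(reach)


def overly_broad_keys(
    role_access: Dict[str, Set[str]], max_roles: int = 1
) -> Dict[str, List[str]]:
    """Return the keys that can reach more than max_roles distinct roles."""
    return {
        key_name: roles
        for key_name, roles in roles_per_key(role_access).items()
        if len(roles) > max_roles
    }


# Hypothetical example data: which keys are authorized on which role.
access = {
    "webserver": {"jenkins-deploy", "alice"},
    "database": {"jenkins-deploy", "dba-backup"},
    "loadbalancer": {"jenkins-deploy"},
}
print(overly_broad_keys(access))
```

With this sample data, only the deploy key is flagged, which is exactly the kind of key worth asking “is that access really necessary?” about.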

Although NISTIR 7966 recommends automation as a key component in meeting these goals effectively, it doesn’t recommend any specific product. I’m a fan of Puppet, so I would suggest strategies based around Puppet best practices.

Establishing an inventory of your SSH keys is straightforward in Puppet environments that use Hiera. Using the default YAML backend and a Puppet module from the Forge (mthibaut/users, for instance, although there are many good choices), you could quickly build your inventory while establishing a framework for meeting some of the other recommendations.

Enforcing strong SSH client and server configurations can be accomplished with the same combination of Hiera and the right Puppet module (perhaps saz/ssh?).

While Puppet reduces the burden of deploying good policy, you’ll still need to establish those policies in the first place. NIST recommends that you treat your SSH keys as you would your passwords. Make certain that compromised keys are removed quickly and completely, and that valid deployed keys are rotated out of service regularly, for users and automated processes alike. When users no longer need access to resources, their keys should be removed promptly as well.
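A rotation policy is easier to enforce when something reports which keys have outlived it. As a minimal Python sketch, assuming you track a deployment date for each key (the mapping, the key names, and the function name `keys_due_for_rotation` are all hypothetical, and the 90-day default is only a placeholder for whatever interval your policy sets):

```python
from datetime import date, timedelta
from typing import Dict, List


def keys_due_for_rotation(
    deployed_on: Dict[str, date], today: date, max_age_days: int = 90
) -> List[str]:
    """Return the names of keys deployed at least max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        name for name, deployed in deployed_on.items() if deployed <= cutoff
    )


# Hypothetical deployment dates for three keys.
deployed = {
    "jenkins-deploy": date(2014, 6, 1),
    "alice": date(2014, 11, 30),
    "dba-backup": date(2014, 9, 24),
}
print(keys_due_for_rotation(deployed, today=date(2014, 12, 22)))
```

The same list can drive removal: a key that appears here and whose owner no longer needs access is a candidate for deletion rather than rotation.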

There are more detailed recommendations in NISTIR 7966, but for most environments, starting out by establishing some simple policies with good automated enforcement is an important first step. If you are not currently implementing these recommendations, you should consider them sooner, rather than later.
