Visual Regression Testing Part 3: Integrating PhantomCSS and Grunt with Jenkins

Austin Corso, Developer
#Testing | Posted

The suite of visual regression tests detailed in part 1 of this series provided the Department of Energy with a solid system to test key components, layouts, and styling throughout the platform. In part 2, the automation provided through Grunt spared developers the tedium of running the suite by hand against multiple breakpoints and multiple environments, each with its own URLs. We had almost reached the state of integration we were striving for, but one key piece remained: we wanted to eliminate the need for a developer to run the tests at all. The test suite should run completely autonomously, routinely preparing a fresh list of test results for developers. It was time to integrate our Grunt task and PhantomCSS test suite with Jenkins!

The initial step was to set up a Jenkins instance, but I won’t be covering that in this post (we already had a Jenkins instance set up for a wide variety of other routine tasks on the Department of Energy platform). Instead, I’ll jump right into how we crafted two new Jenkins jobs and, with slight modifications to two pre-existing jobs, fully automated the platform’s testing suite.

Jenkins Job #1: Running the Test Suite

Creating a new Freestyle Project, we run through a few initial configurations, setting the repository URL, our build name, email notifications, and build history appropriately. We then parameterize the job with a single parameter that directs the test suite to run against the environment of our choosing: production, staging, or integration. Then it is time to get down and dirty with bash.

Using bash, we have our job change into the directory containing our test suite and its corresponding Gruntfile. Then we clear out the previous test run’s results for that environment and kick off PhantomCSS, passing the environment parameter to the Grunt task we set up in part 2 of this blog series.

  #!/bin/bash -ex
  export TERM=xterm
  export DRUSH_PHP=/usr/bin/php
  DRUSH=/opt/drush/drush
  NPM=/usr/bin/npm
  GRUNT=/usr/bin/grunt
  GRUNT_PHANTOMCSS=/usr/bin/grunt/node-modules/grunt-phantomcss

  # Remove old test results from public dir for specified test environment.
  find tests -wholename "**/$TEST_ENV/*.diff.png" -exec rm -f {} \;
  find tests -wholename "**/$TEST_ENV/*.fail.png" -exec rm -f {} \;

  # Execute test suite.
  "${GRUNT}" "phantom:${TEST_ENV}"
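
Because the script deletes files based on whatever value arrives in TEST_ENV, it may be worth guarding against an empty or unexpected value before anything is removed. The following is only a sketch of such a guard, not part of our original job: the function name is_valid_test_env is hypothetical, and the list of environments is simply the three named in this post.

```shell
#!/bin/bash
# Hypothetical guard, not part of the original Jenkins job.
# Succeeds only when the argument is one of the three environments
# named in this post; anything else (including an empty value) fails.
is_valid_test_env() {
  case "$1" in
    production|staging|integration) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use near the top of a job script:
#   is_valid_test_env "${TEST_ENV:-}" || { echo "Unknown TEST_ENV" >&2; exit 1; }
```

Placing a check like this before the find commands means a mistyped parameter stops the job instead of silently matching nothing, or matching the wrong directory.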

Our suite contains the 100 or so visual regression tests created in part 1, which verify every content type, each DOE office’s homepage, mobile interactivity, and search functionality. Once PhantomCSS finishes running it, we use the Hudson Post build task plugin to confirm that the tests ran without Jenkins errors and, if so, to copy the test results to a web-accessible directory.
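
The copy step itself could look something like the sketch below. It is an illustration under assumptions rather than our actual post-build script: the function name publish_results and both directory arguments are hypothetical, and the decision about whether the build succeeded is left to Jenkins. The only thing borrowed from the job above is the layout implied by its find patterns, where each environment has its own directory of PNG files.

```shell
#!/bin/bash
# Hypothetical post-build step; names and directories are assumptions.
# Copies the PNG results for one environment from the tests directory
# into a web-accessible directory, keeping the relative layout.
# The body runs in a subshell, so the cd and the variables do not leak.
# Filenames containing newlines are not supported by this sketch.
publish_results() (
  src_dir="$1"
  test_env="$3"
  [ -d "$src_dir" ] && [ -n "$test_env" ] || exit 1
  mkdir -p "$2" || exit 1
  # Resolve the destination to an absolute path before changing directory.
  web_dir="$(cd "$2" && pwd)" || exit 1
  cd "$src_dir" || exit 1
  find . -type f -path "*/$test_env/*.png" | while IFS= read -r file; do
    mkdir -p "$web_dir/$(dirname "$file")" || exit 1
    cp "$file" "$web_dir/$file" || exit 1
  done
)
```

A call might look like publish_results tests /var/www/vrt-results "$TEST_ENV", where the destination path is a placeholder for wherever your web server serves static files.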

Jenkins Job #2: Resetting the Test Baselines

For our second Jenkins job, we needed to clear our test baselines and then call our first job to establish a new set. This job is also parameterized by environment. It begins by executing a bash script that removes all test images, including baselines, test results, and diffs.

  #!/bin/bash -ex
  export TERM=xterm
  export DRUSH_PHP=/usr/bin/php
  DRUSH=/opt/drush/drush
  GRUNT=/usr/bin/grunt

  # Remove all images.
  find tests -wholename "**/$TEST_ENV/*.png" -exec rm -f {} \;

Next, we use the Parameterized Trigger Plugin to trigger our first job, passing along the current environment parameter. The first job, detecting that there are no baselines, generates a fresh set for subsequent test runs. The Parameterized Trigger Plugin also lets us take the precaution of triggering the first job only after our bash script has successfully cleared all screenshots.
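
In our setup that "only on success" rule lives in the Jenkins plugin configuration, but the same logic can be sketched in plain bash for anyone chaining the two steps by hand. Everything here is an assumption made for illustration: the function names reset_baselines and reset_then_run are hypothetical, and the follow-up command is passed in as arguments rather than being a triggered Jenkins job.

```shell
#!/bin/bash
# Sketch only: function names and the idea of passing the follow-up
# command as arguments are assumptions, not the original Jenkins setup.

# Delete every PNG (baselines, results, diffs) for one environment.
# Refuses to run when the environment name is empty.
reset_baselines() {
  local tests_dir="$1" test_env="$2"
  [ -n "$test_env" ] || return 1
  find "$tests_dir" -type f -path "*/$test_env/*.png" -exec rm -f {} +
}

# Run the follow-up command (everything after the first two arguments)
# only if the cleanup succeeded, mirroring the trigger-on-success rule.
reset_then_run() {
  local tests_dir="$1" test_env="$2"
  shift 2
  reset_baselines "$tests_dir" "$test_env" && "$@"
}
```

For example, reset_then_run tests "$TEST_ENV" /usr/bin/grunt "phantom:$TEST_ENV" would clear one environment's images and then start the suite, which regenerates baselines because none exist.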

DOE Visual Regression Jenkins

With the two new Jenkins jobs in place, we incorporate them into our existing staging and production deployment jobs. Additionally, we schedule our second job, which re-establishes baselines, to run every night on integration, providing us with readily accessible, up-to-date baselines. We also configure our test suite to run after each pull request is merged into integration, which gives developers immediate feedback when a pull request breaks integration.

With these jobs configured, our Visual Regression test suite is fully automated. Immediately prior to every deployment to staging and production, without any signal from a developer, our test suite establishes a fresh set of baselines. Immediately after the deployment, with our new code in place and caches cleared, our tests run again, generating a diff for each test that highlights every change the deployment introduced. A developer can then review each change and decide whether it is expected, due to work performed in the release, or unexpected, meaning a regression has been introduced. To facilitate this process, we put together an interface.

Test Suite User-Interface

Using PHP, JavaScript, and CSS, we designed a minimal interface to streamline developers’ review of test result diffs. The PHP locates all of the images in the test directory and lists them on one half of the page, with a viewing area for selected results on the other half. Each of our environments is also listed, allowing us to filter the test results as needed. Clicking a test result displays its diff instantly in the viewing area, so developers can move through the results quickly.
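
Our interface does the file discovery in PHP, and that code is not reproduced here. As a rough stand-in, the lookup it performs can be sketched in bash. The function name list_diffs is hypothetical, and the directory layout (an environment directory containing *.diff.png files) is inferred from the find patterns in the Jenkins jobs above.

```shell
#!/bin/bash
# Stand-in sketch in bash; the real interface uses PHP.
# Prints the diff images under a tests directory, one per line, sorted.
# With a second argument, only that environment's diffs are listed;
# without it, diffs from every environment directory are listed.
list_diffs() {
  local tests_dir="$1" test_env="${2:-*}"
  find "$tests_dir" -type f -path "*/$test_env/*.diff.png" | sort
}
```

Calling list_diffs tests staging corresponds to choosing the staging filter in the interface, while list_diffs tests corresponds to showing every environment.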

DOE Visual Regression Test Suite UI 1

DOE Visual Regression Test Suite UI 2

Bringing together the many technologies incorporated in our Visual Regression test suite was an exciting process, and the result has given the Department of Energy additional safeguards against regressions, bringing quality assurance to an all-time high. The PhantomCSS module, tying together PhantomJS, CasperJS, and ResembleJS, has allowed us to catch visual regressions across environments within the Department of Energy platform. Grunt and the grunt-phantomcss module have spared developers the tedium of running the test suite by hand against multiple breakpoints and multiple environments. Finally, Jenkins has automated the entire process, running our visual regression test suite automatically upon deployments to staging and production, and nightly on integration. Through this Visual Regression testing infrastructure - and our ongoing development of functional Behat tests - we can feel confident that the Department of Energy platform is well guarded against regressions and is providing users with an exceptional experience.
