Moving to a new software system involves much more than building and configuring the new digs. Infrastructure, staff training, and content migration are all significant headaches. Luckily Drupal 8 has you covered with new architecture and functionality to facilitate the transition. Today we’ll discuss the challenge of automatically moving your content.
Step 1: Get the Content
All content is data. Using or moving that data into a new site presents three big challenges:
- Access to the data can be hindered by policy or technology. Can the content be cleanly exported into files that can be consumed by other systems? Is it running in a live database that will answer queries from a migration script? Can it be offered via a robust API that can respond to the hundreds or thousands of API requests necessary to relay the content through the testing stages and final migration process?
- The structure and format of the data can be idiosyncratic, inconsistent, or lacking critical semantics needed for an external system to parse and understand what’s been provided. You can spend extra time making the consumer of the data more intelligent, but it’s best if your CSV, XML, or JSON data, or your live database, presents a clear and usable model for the content. Simple things like ensuring the presence of column headers in a CSV file can save hours or days of extra analysis and development time.
- The individual values of the data can be presented in a difficult-to-understand format. For a temperature field, is the unit Celsius or Fahrenheit? If the field is a date, is it presented in Unix time, ISO 8601 or RFC 3339, or simply Month/Day/Year format? Understanding the formats and use cases of the existing data is key to adapting that data for the new site.
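To make the date question concrete, here is a minimal Python sketch (not part of Drupal's migration system) that normalizes the three formats mentioned above into a single representation. The function name `normalize_date` is ours, and treating values without an offset as UTC is an assumption you would need to confirm with whoever owns the source data.

```python
from datetime import datetime, timezone


def normalize_date(raw: str) -> datetime:
    """Parse a source date string into a timezone-aware UTC datetime.

    Tries Unix time first, then ISO 8601 / RFC 3339, then Month/Day/Year.
    Raises ValueError if none of the formats match.
    """
    value = raw.strip()

    # Unix time: seconds since the epoch, e.g. "1456790400".
    # Caveat: an all-digit value such as "20160301" is also read as Unix time.
    if value.isdigit():
        return datetime.fromtimestamp(int(value), tz=timezone.utc)

    # Older Python versions reject a trailing "Z", so spell out the offset.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"

    try:
        # ISO 8601 / RFC 3339, e.g. "2016-03-01T00:00:00+00:00".
        parsed = datetime.fromisoformat(value)
    except ValueError:
        # Month/Day/Year, e.g. "03/01/2016".
        parsed = datetime.strptime(value, "%m/%d/%Y")

    # Assumption: values without an explicit offset are already in UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

All three spellings of the same day, `"1456790400"`, `"2016-03-01T00:00:00Z"`, and `"03/01/2016"`, come back as the same value, which is the property a migration needs before it can compare or sort dates from mixed sources.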
Retrieving content is often regarded as a one-time process. In practice the transfer of the content might be a simple file download, but transferring the necessary context and understanding can be an extended investigation and series of interviews with domain specialists.
Drupal 8’s migration system (based on the amazing Migrate module from previous versions of Drupal) has a flexible, extensible architecture that allows it to retrieve, parse, and understand data values of all kinds. Where Drupal cannot natively handle your content, its object-oriented architecture keeps down the cost of building a tailored solution for the specific pieces that are not well supported.
Is Drupal 8 missing some migration features you remember from the Migrate module in Drupal 7? Pulling contributed modules into Drupal core is an intensive scoping and polishing process that does not automatically incorporate all the functionality. Migrate module maintainer Mike Ryan has already gotten the ball rolling on the missing pieces with the Migrate Plus module. To read more about the details of migration that are included in Drupal core, check out the Migrate API documentation.
Now that we understand the nature of our “source” content, we need to adjust it so we can pull it into the data model of our new site.
Step 2: Transform the Content
The new website has a specific data model in mind for all content. Incoming data from a migration process needs to be adjusted to conform to that model. Furthermore, this is the perfect moment to automate any transformations that optimize the data for its use case in the new system. A great example of a use case error: press releases that confuse the publication date of the content with the release date of the news.
A common problem to address in the transformation step is the handling of dates. Often the source site and the new site have different needs around the time component of a date. As a result, the migration process may be responsible for reducing the granularity of date data by removing the time of day. Other times, the migration needs to introduce approximations of date data so the existing content will be compatible with plans for how to present future content. Identifying business needs at this level is almost always an exercise for the spreadsheets.
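The two date adjustments described above can be sketched in a few lines of Python. This is an illustration only, not how Drupal implements its transformation step; the source field names `headline` and `released_at`, and the choice to fill missing months and days with 1, are hypothetical.

```python
from datetime import date, datetime


def strip_time(timestamp: str) -> date:
    """Reduce granularity: keep the calendar date of an ISO 8601 timestamp."""
    return datetime.fromisoformat(timestamp).date()


def approximate_date(partial: str) -> date:
    """Introduce an approximation: expand "2014" or "2014-06" to a full date.

    Assumption: a missing month or day defaults to 1.
    """
    parts = [int(piece) for piece in partial.split("-")]
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)


def transform_press_release(source: dict) -> dict:
    """Map a source row (hypothetical field names) onto the new data model."""
    return {
        "title": source["headline"].strip(),
        # The new site only needs the release day, not the time of day.
        "release_date": strip_time(source["released_at"]).isoformat(),
    }
```

Whatever defaults you pick for approximated dates, record them in the same spreadsheet as the business rules so editors know which values are real and which were filled in.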
Step 3: Save the Content
Once the content and all its pieces have been adjusted for the new system, it’s time to construct a new data record that can be saved. Drupal’s standard process will break down the content into the various discrete fields of the new site so your carefully determined structure will be initialized with your existing content.
Drupal 8’s entity API is more complete than in Drupal 7: an entity’s fields can now be checked through a dedicated Entity Validation API rather than only through form submission. In the past, validation of the entity model as part of the entity save API was spotty at best, leading to mysterious migration errors and broken content that had to be manually detected. Migrations can lean on that shared validation instead of ad hoc logic, which will accelerate the creation and troubleshooting of new migrations. Drupal’s developer documentation contains more details on the Entity Validation API if you are interested in how you might use it outside of migrations.
Because the migration is leveraging the same code and processes used for regular content creation, all migrated content will be functional as part of the site as though it had been manually created. Careful review is still needed, as the styling of the HTML markup may need to be adjusted to fit the new site’s theme. Furthermore, if the new site has some underlying content strategy changes, viewing the old content in place may surface some ideas around making improvements. Make sure the launch timeline has space for this kind of review.
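As a language-neutral illustration of validating a record before it is saved, here is a small Python sketch. It is not Drupal's Entity Validation API; the function names, the two rules, and the list standing in for storage are all hypothetical.

```python
from datetime import date
from typing import Dict, List


def validate(record: Dict[str, str]) -> List[str]:
    """Return a list of violations; an empty list means the record is valid."""
    violations = []
    if not record.get("title", "").strip():
        violations.append("title: must not be empty")
    try:
        date.fromisoformat(record.get("release_date", ""))
    except ValueError:
        violations.append("release_date: expected an ISO 8601 date")
    return violations


def save_if_valid(record: Dict[str, str], storage: List[Dict[str, str]]) -> List[str]:
    """Store the record only when validation passes; report violations otherwise."""
    violations = validate(record)
    if not violations:
        storage.append(record)
    return violations
```

Collecting every violation, rather than stopping at the first, gives a migration log that points at the exact field that needs attention in the source data.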
Who Cares About Migrations?
Concern about migration, and the mental model of splitting it into the three major pieces above, is certainly not unique to organizations migrating into their first Drupal site. Migration is a critical concern for any system change. Over the years Drupal’s practice of aggressive reinvention has spawned the realization that really everyone should be concerned with migration.
It’s rare to approach the upgrade of an existing Drupal site to the next major version as a one-for-one conversion. Each new version of Drupal brings so many new capabilities that it’s the perfect time to refresh the site, apply new thinking to how it meets your digital goals, and build a more solid foundation to meet the expectations of web users. Any new Drupal site is first and foremost a new website.
For this reason, Drupal 8 has dropped the notion of an upgrade path. You do not “upgrade” to a major version of Drupal; instead you set up your new Drupal site and migrate your content into it. Drupal 8 ships with out-of-the-box support for migrating from Drupal 6 and Drupal 7, which is a great help to jumpstart this effort.
You Might Need a Content Studio
If you are embarking on a significant new Drupal 8 site you should consider whether it’s time to rethink your content, or even your overall content strategy. Drupal has always been at the forefront of structured content modeling, and the toolkit has only expanded with this new version. This is a prompt to decide: is now the time to start aiming your system to support a mobile app, or an even broader omni-channel strategy?
If that kind of transition is in the works, you should not wait for the site to be completed to rewrite your content.
We generally recommend against trying to build your new content in an under-production website. This is akin to moving into a house when the walls are up but before the roof has been put on top: it’s uncomfortable and you might slow down the construction crew.
Lately we’ve been experimenting with hosted solutions such as GatherContent (with their easy-to-use Drupal integration module) to provide a content studio. This supplies effective tools to a content team and allows development to stay focused. Think of it as a long-term hotel, which gives you space to continue content operations.
Finding the Least-Cost Solution
In all this discussion of how to put together an automatic migration in Drupal, it’s worth stepping back to make sure it’s the right decision. There is always a cost in developer and QA time to write migration software and validate the output, but because the migration is really a one-time operation, you do not necessarily need an automatic solution. Maybe the migration can be completed by a team of junior staff with keyboards.
We determine recommendations around this on a case-by-case basis, because every site has different overhead in automation costs. An estimate broken down by content type is effective at identifying specific types of content that might be cost effective to automate. The evaluation weighs the flat cost of building the automatic migration against the per-page cost of manual migration.
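The comparison of a flat automation cost against a per-page manual cost is simple arithmetic, sketched below in Python. The function names and the dollar figures in the usage note are hypothetical and exist only to illustrate the calculation.

```python
import math


def break_even_pages(flat_automation_cost: float, manual_cost_per_page: float) -> int:
    """Smallest page count at which manual migration costs at least as much as automation."""
    return math.ceil(flat_automation_cost / manual_cost_per_page)


def cheaper_approach(pages: int, flat_automation_cost: float, manual_cost_per_page: float) -> str:
    """Compare the two estimates for one content type."""
    manual_total = pages * manual_cost_per_page
    return "automatic" if manual_total > flat_automation_cost else "manual"
```

With made-up inputs of a 6,000 flat estimate and 20 per manually migrated page, `break_even_pages(6000, 20)` gives 300 pages: a content type with 100 pages would favor manual migration, while one with 500 pages would favor automation. Running this per content type mirrors the broken-down estimate described above.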
In the chart of our imaginary migration analysis, the cost to manually migrate 300 pages matches the cost of our flat programmatic migration estimate. (Automatic migrations do not become more costly by adding more pages, except to the extent it increases the runtime of the migration from minutes to hours or days.) It is around this break-even point that we would start considering an automatic migration recommendation.
Are you interested in a personal consultation regarding the cost of manual vs. automatic migration? Phase2 provides expert consultations on if, when, and how your team should migrate to Drupal 8. Get in touch.