We recently had the opportunity to work with Thomson Reuters on a rather interesting project: create a custom-built editorial suite on top of Drupal, with an eye towards eventually building out a multi-site platform, and build it in three months, just in time for the 2012 London Olympics.
Reuters would be pushing hundreds of articles and thousands of photos into the site every day, with high-volume days (like the opening ceremonies, or the Men’s 100m final) doubling the volume. Reuters uses a feed management engine called MediaConnect to serve out photos and text articles to their clients, and our site would be ingesting this content to populate the site.
The content model was complex - a single article could consist of dozens of photos along with article text, and this text was subject to repeated updates. For example: Images start to arrive with a slugline specific to the Men's 100m final, and as each is ingested it is appended to a slideshow that was published when the first image with that slugline arrived. Then, a text item arrives with the same slugline, consisting of a short write-up of the final results, and that item is in turn integrated into the article, which now consists of a multi-image slideshow and a text article. Later, more images arrive and are automatically appended, and an update to the text item (perhaps a final, longer story) overwrites the original text item.
All of this had to have minimal latency, of course.
We built an ingestion engine that took in the MediaConnect content stream, converted the text and photos to type-specific entities, and stitched together the associated items into an article instantiated as a complex node that could support hundreds of content associations via a system of content manifests. This content fed a network of landing pages specific to sports, events, athletes, and countries that had been built out to receive automatic content streams. The end goal was a site that, once hooked up to the MediaConnect Olympics content channel, could be populated with no editorial intervention whatsoever.
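To make the assembly rules concrete, here is a minimal sketch in Python. It is not the Drupal code that ran on the site; the item fields (kind, slugline, id, body) and the function name ingest are assumptions made for illustration, and the real MediaConnect items carried far more metadata. It shows only the stitching rule described above: the first item with a new slugline creates the article, photos accumulate in the slideshow, and a later text item replaces the earlier one.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Article:
    slugline: str
    photos: list = field(default_factory=list)  # slideshow, in arrival order
    text: Optional[str] = None                   # latest text item, if any


def ingest(articles: dict, item: dict) -> Article:
    """Attach one feed item to the article that shares its slugline."""
    slug = item["slugline"]
    # The first item with a new slugline creates the article.
    if slug not in articles:
        articles[slug] = Article(slugline=slug)
    article = articles[slug]
    if item["kind"] == "photo":
        article.photos.append(item["id"])  # photos are appended
    elif item["kind"] == "text":
        article.text = item["body"]        # a newer text item overwrites
    return article


# Hypothetical feed items for a single event.
articles = {}
feed = [
    {"kind": "photo", "slugline": "M100M-FINAL", "id": "p1"},
    {"kind": "text", "slugline": "M100M-FINAL", "body": "Short write-up"},
    {"kind": "photo", "slugline": "M100M-FINAL", "id": "p2"},
    {"kind": "text", "slugline": "M100M-FINAL", "body": "Final story"},
]
for item in feed:
    ingest(articles, item)
```

Keying everything on the slugline is what lets photos and text arrive in any order and still land in the same article.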
That was the easy part
Well, it wasn’t actually easy. But it was straightforward - the content was all encapsulated in well-formed XML, and although the ingestion and assembly rules were complex, once in place the system would run.
Harder was allowing human intervention. The editorial use cases were legion, and as is the case with any high-profile news site, especially one covering a high-profile event like the Olympics, the qualitative differences between one photo and another, one version of a headline and another, were subtle but crucial. MediaConnect (and our system) lay at the end of a chain of editorial content management systems, each of which applied a layer of curation, so the content stream was not a raw torrent of news (for a look at what the photo editors experienced, see "A Glimpse Into the Hectic Life of a Reuters Photo Editor at the Olympics"). Even so, the editorial team required a high degree of granular editorial control over the story assembly process. If a story received 50+ photos, an editor would want to lock the best one at the top while still allowing automatic ingestion of more. Updates would come in for articles that had already been edited, but editors would want to review these updates to ensure that their own corrections were not overwritten. Articles built by hand would need to be ready to receive automated content at any time.
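Two of those requirements, locking a lead photo and holding feed updates for review, can be sketched in a few lines of Python. Again, this is an illustration under assumptions rather than the code we shipped: the class EditableArticle and its fields (locked_lead, edited_by_hand, pending_text) are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EditableArticle:
    photos: list = field(default_factory=list)
    text: str = ""
    locked_lead: Optional[str] = None    # photo an editor locked at the top
    edited_by_hand: bool = False         # set once an editor changes the text
    pending_text: Optional[str] = None   # feed update awaiting review

    def slideshow(self) -> list:
        """Photos with the locked lead first, the rest in arrival order."""
        if self.locked_lead is None or self.locked_lead not in self.photos:
            return list(self.photos)
        rest = [p for p in self.photos if p != self.locked_lead]
        return [self.locked_lead] + rest

    def receive_text(self, body: str) -> None:
        """Apply a feed update, or park it if an editor has edited the text."""
        if self.edited_by_hand:
            self.pending_text = body
        else:
            self.text = body

    def approve_pending(self) -> None:
        """Editor accepts the parked update, replacing the current text."""
        if self.pending_text is not None:
            self.text = self.pending_text
            self.pending_text = None
```

The point of the design is that automatic ingestion never stops: new photos keep arriving behind the locked lead, and new text waits in a pending slot instead of silently replacing an editor's corrections.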
Drupal provided many, many advantages to us in building out the editorial toolkit that the Reuters team in London used to manage the constant ebb and flow of content. We used many contributed modules (Workbench, Image Cache, Contentlock, among others) as well as our own custom modules, but at the core of our strategy was the creation of an editorial UI that we layered on top of Drupal, giving editors, authors, and other admins the ability to manage the content of the site without having to enter the Drupal admin context. This UI took several forms.
A CMS dashboard was presented to each user after logging in, and it provided direct access to workflow states, lists of content with pending MediaConnect write-through updates that required review and approval, and lists of content locked against edit. It also gave each editor a list of sections that they had been assigned responsibility for, each linked to the curation administration toolkit for that section.
Although the primary processes of building content on the site were automated, there is always a need for editors to insert themselves into the article assembly process. To do this, they need access to the same functions, processes, and tools that the system uses to build articles. We provided this to them in the form of the Content Builder, which is integrated into the article node edit form in the shape of a toolbar that slides out from the side. Its centerpiece is a search utility that can search either locally or on MediaConnect for text, images, and video. Once a search return is obtained, an editor can drag content from the results into the node form, inserting photos into the article’s slideshow for example, or adding a video. If the selected item was found on MediaConnect, it is ingested through the same content gateway used to process the incoming feeds of MediaConnect content, and the asset now lives locally.
Section Fronts and Collections
To handle the dual duties of publishing automated topical feeds into the section fronts while allowing for manual curation, we created a custom ‘Collections’ module that allows for taxonomy-driven lightweight queries (no Views here) with allowance for ‘pinning’ at the top of the list. The UI that we built to manage the Collections has preview functions that can be hooked into editorial workflow processes. Additionally, the Content Builder tool was added to the Collections UI, allowing editors to search for local Drupal articles and drag them into the section, thus pinning them.
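The pinning behaviour is simple to express. The sketch below is Python rather than the module's actual Drupal code, and the function name build_collection and the article fields (id, terms, published) are assumptions for illustration: pinned items come first in the order the editor placed them, followed by the newest articles tagged with the section's taxonomy term.

```python
def build_collection(articles, term, pinned_ids, limit=10):
    """Return article ids for a section front.

    Pinned ids come first, in editor order; the remainder is filled with
    articles tagged with `term`, newest first, skipping anything pinned.
    """
    known_ids = {a["id"] for a in articles}
    pinned = [i for i in pinned_ids if i in known_ids]
    automatic = sorted(
        (a for a in articles if term in a["terms"] and a["id"] not in pinned),
        key=lambda a: a["published"],
        reverse=True,
    )
    return (pinned + [a["id"] for a in automatic])[:limit]


# Hypothetical articles; `published` stands in for a timestamp.
sample = [
    {"id": 1, "terms": {"athletics"}, "published": 100},
    {"id": 2, "terms": {"athletics"}, "published": 200},
    {"id": 3, "terms": {"swimming"}, "published": 300},
    {"id": 4, "terms": {"athletics"}, "published": 50},
]
front = build_collection(sample, "athletics", pinned_ids=[4])
```

Here the oldest athletics article leads the section because an editor pinned it, while the automatic feed continues to fill the slots beneath it.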
This was an exciting project to work on. With the tools and processes we built in place, it was really great to see the Reuters London 2012 team work their magic and keep us all up to date on the latest Olympic headlines.