OpenCalais on Drupal 7

OpenCalais has been completely rewritten for Drupal 7. We chose to start essentially from scratch because the new (for D7) entity and field systems radically changed the way we interact with taxonomy terms, and because we hoped to vastly improve the UI and make room for more functionality. OpenCalais now uses (slightly modified) taxonomy autocomplete fields, along with automatically created and curated taxonomies, to maintain the tag base. This move has given us much greater flexibility in how we configure OpenCalais and tag nodes.

Josh Caldwell, Developer

Per Content-Type Configuration

Because of these changes, OpenCalais now has per-content-type configuration. This lets you tag your article nodes differently from your blog posts. My hope is that this can lead to much richer and more meaningful tagging without your site suddenly becoming swamped in a hundred thousand taxonomy terms.

The new configuration screen is located in the local task tabs of each content type's edit area. The UI has been revamped to use sliders that set the relevance threshold a term must meet before it is suggested. Each category can be turned on or off and individually configured with its own threshold. Turning a category on or off automatically adds or removes a field on your content type to represent that category. These fields don't need to be (and really shouldn't be) edited manually. OpenCalais curates them, so you don't need to worry about them.
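To make the per-category thresholds concrete, here is a minimal sketch in plain PHP of how suggestions might be filtered for one content type. The function name, the array shapes, and the 0 to 1 relevance scale are assumptions made for illustration, not the module's actual internals.

```php
<?php
/**
 * Keep only suggested terms that meet their category's threshold.
 *
 * Hypothetical helper: $config maps category => threshold for the enabled
 * categories; $suggestions maps category => (term name => relevance score).
 */
function example_filter_by_threshold(array $suggestions, array $config) {
  $kept = array();
  foreach ($suggestions as $category => $terms) {
    // A category missing from the config is treated as turned off.
    if (!isset($config[$category])) {
      continue;
    }
    foreach ($terms as $name => $relevance) {
      if ($relevance >= $config[$category]) {
        $kept[$category][$name] = $relevance;
      }
    }
  }
  return $kept;
}

// Articles tag cities generously and people strictly; companies are off.
$article_config = array('City' => 0.2, 'Person' => 0.7);
$suggestions = array(
  'City' => array('Boston' => 0.35, 'Salem' => 0.1),
  'Person' => array('Jane Doe' => 0.9, 'John Roe' => 0.4),
  'Company' => array('Acme' => 0.95),
);
$kept = example_filter_by_threshold($suggestions, $article_config);
```

A blog content type could pass a different `$config` array to the same function, which is the point of configuring each content type separately.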

Presets

Because there are so many categories, it seemed clear that we needed a 'quick settings' paradigm for configuring content types. With this in mind, we created OpenCalais Presets. Presets are CTools exportables, and each one represents a set of categories with their corresponding thresholds that can be quickly applied to a content type. We didn't stop there, though: presets can also be mixed and matched. You can apply a GeoTagging preset (for example) and then apply a preset for social tags, and they will simply merge together, applying all the categories in each, with the second taking precedence where thresholds conflict. This can be repeated as often as you like. There is also the option to apply a preset in a strict sense, deleting any fields not included in it. You may also manually add any fields you like and even remove fields the preset applied. In this way, the presets paradigm gives much more flexibility than simply using Features.
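The merge behavior described above can be sketched in a few lines of plain PHP. This is only an illustration under stated assumptions: the function name, the category names, and the representation of a preset as a category => threshold array are hypothetical, not the module's actual data model.

```php
<?php
/**
 * Apply presets, in order, to a content type's category thresholds.
 *
 * Hypothetical helper: $current and each preset map category => threshold.
 */
function example_apply_presets(array $current, array $presets, $strict = FALSE) {
  $merged = array();
  foreach ($presets as $preset) {
    // With string keys, array_merge() lets later presets override earlier ones.
    $merged = array_merge($merged, $preset);
  }
  if ($strict) {
    // Strict mode: categories not named in any preset are dropped.
    return $merged;
  }
  // Default mode: existing categories survive; presets override on conflict.
  return array_merge($current, $merged);
}

$geo = array('City' => 0.3, 'Country' => 0.3);
$social = array('SocialTags' => 0.5, 'City' => 0.6);
$current = array('Person' => 0.8);

$loose = example_apply_presets($current, array($geo, $social));
$strict = example_apply_presets($current, array($geo, $social), TRUE);
```

In this sketch `$loose` keeps the manually added `Person` category, while `$strict` contains only the categories from the two presets. In both, `City` ends up with the threshold from the preset applied second.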

Filters Framework

The next thing we added was a generalized way to 'filter' the tags returned by OpenCalais before they are shown to the user or automatically applied. We did this by defining hook_opencalais_filter_info(), which returns an associative array of filter definitions, each naming its callback functions. For example, OpenCalais by default implements one filter, a global blacklist, which looks like this:

```php
<?php
/**
 * Define a filter type.
 */
function opencalais_opencalais_filter_info() {
  return array(
    'global_blacklist' => array( // the key provides the machine name of the filter
      'title' => t('Blacklist'), // the title as shown to a user
      'description' => t('Use a taxonomy vocabulary as a global black list. Terms in the vocabulary will not be applied.'), // description as shown to the user
      // The callback to filter the terms; it has the signature:
      // function opencalais_filters_blacklist(&$terms, $content_type)
      'callback' => 'opencalais_filters_blacklist',
      'configuration' => 'opencalais_filters_blacklist_config', // the callback to get the configuration form, if one is needed for the filter
    ),
  );
}
```

Our hope is that opening up filters to other modules will lead to add-on modules that provide filters and can be installed as needed.
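As a rough idea of what such an add-on might look like, here is a sketch of a filter provided by a hypothetical module named "mymodule". The hook name and the callback signature come from the blacklist example; the filter itself, the assumed shape of $terms (category name => list of term-name strings), and the assumption that the 'configuration' key may be omitted are illustrative guesses.

```php
<?php
/**
 * Implements hook_opencalais_filter_info() for a hypothetical "mymodule".
 */
function mymodule_opencalais_filter_info() {
  return array(
    'mymodule_min_length' => array(
      'title' => t('Minimum length'),
      'description' => t('Drop suggested terms shorter than three characters.'),
      'callback' => 'mymodule_filter_min_length',
      // 'configuration' is left out on the assumption that it is optional.
    ),
  );
}

/**
 * Filter callback; $terms is passed by reference and modified in place.
 *
 * Assumed shape of $terms: category name => list of term-name strings.
 */
function mymodule_filter_min_length(&$terms, $content_type) {
  foreach ($terms as $category => $names) {
    foreach ($names as $index => $name) {
      if (strlen($name) < 3) {
        unset($terms[$category][$index]);
      }
    }
  }
}

// Standalone check of the callback; the hook itself needs Drupal for t().
$terms = array('City' => array('LA', 'Boston'), 'Person' => array('Alice'));
mymodule_filter_min_length($terms, 'article');
```

Because the callback receives the terms by reference, it removes entries directly and does not need to return anything.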

Adding Fields Programmatically

One of the things we explored as we built OpenCalais for Drupal 7 is adding and removing fields programmatically. This creates a paradigm of 'managed fields' that the user doesn't need to worry about and that don't need to be exported into Features (but can be if desired). It also gives end users of the site a nice way to update things without having to go through the field configuration screen, which can often be confusing. The current pitfall is that these fields show up on the manage fields screen and can be manually edited or changed, potentially breaking functionality. While this is good for expert users, it might be good to find a way to hide these fields from most administrative users.
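For readers unfamiliar with the Drupal 7 Field API, the general technique might look like the sketch below. It uses the core functions field_info_field(), field_create_field(), field_info_instance(), and field_create_instance(); the helper names, the "opencalais_" naming scheme, and the assumption that a vocabulary with the same machine name already exists are hypothetical and not taken from the module's code.

```php
<?php
/**
 * Build field and instance definitions for one category.
 *
 * Hypothetical helper; the "opencalais_*" naming scheme is an assumption.
 */
function example_category_field_definitions($category, $bundle) {
  $suffix = strtolower(preg_replace('/[^A-Za-z0-9]+/', '_', $category));
  // Drupal 7 limits field names to 32 characters.
  $machine_name = substr('opencalais_' . $suffix, 0, 32);
  return array(
    'field' => array(
      'field_name' => $machine_name,
      'type' => 'taxonomy_term_reference',
      'cardinality' => -1, // same value as FIELD_CARDINALITY_UNLIMITED
      'settings' => array(
        'allowed_values' => array(
          // Assumes a vocabulary with this machine name already exists.
          array('vocabulary' => $machine_name, 'parent' => 0),
        ),
      ),
    ),
    'instance' => array(
      'field_name' => $machine_name,
      'entity_type' => 'node',
      'bundle' => $bundle,
      'label' => $category,
      'widget' => array('type' => 'taxonomy_autocomplete'),
    ),
  );
}

/**
 * Create the field and its instance if missing (needs a bootstrapped Drupal 7).
 */
function example_ensure_category_field($category, $bundle) {
  $definitions = example_category_field_definitions($category, $bundle);
  $name = $definitions['field']['field_name'];
  if (!field_info_field($name)) {
    field_create_field($definitions['field']);
  }
  if (!field_info_instance('node', $name, $bundle)) {
    field_create_instance($definitions['instance']);
  }
}
```

Checking for an existing field and instance before creating them keeps the helper safe to call repeatedly, for example every time a category is switched on.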

Our hope for the future is to refine this idea, as it clearly can solve many problems faced in Drupal, and possibly to genericize the adding of boolean checkboxes into the Mark module. My hope is that this paradigm can be used to create easy-to-use modules that 'just work' and are more performant than other options.
