Monday, January 25, 2010

Drupal: Exposing Data through Tokens

An often overlooked aspect of site development is the URL schema. The paths used to access a site form a type of interface; it's easier to remember that an index of all thesis pages exists at /collections/thesis than at /biblio/type/108. During development, a known and consistent schema can help with quick navigation during testing and remove the need to constantly look up exposed interfaces when implementing the UI.

This is a feature that Drupal supports quite well through its clean URLs functionality and even better with the Pathauto module, which provides a mechanism for the creation of rules to automate the clean URL assignment. For the schema of my project's biblio data, I settled on a simple type/year/title layout for the URLs, so the URL provides a reasonable amount of information about the contents of the page without needing to visit it.

Pathauto accepts a number of replacement patterns - placeholder strings that are substituted with the values of the fields they name - which are provided by the Token module. Other modules, such as the Biblio module, can inject their own tokens, which can then be used by Pathauto (and any other module with token support, such as Automatic Nodetitles). Placing the above schema within a 'collections' namespace requires the following rule to be assigned to the Biblio type: collections/[biblio_type]/[biblio_year]/[title-raw].
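To make the substitution concrete, here is a minimal standalone sketch of how the tokens in a pattern might be swapped for values. It is not how Token or Pathauto is implemented: the function name example_fill_pattern and the sample values are hypothetical, and Pathauto's own cleanup (lowercasing, separators, length limits) is not reproduced.

```php
<?php
// Hypothetical helper: replace each [token] in a pattern with its value.
// This only illustrates the idea; it is not the Token module's API.
function example_fill_pattern($pattern, array $values) {
    $replacements = array();
    foreach ($values as $token => $value) {
        $replacements['[' . $token . ']'] = $value;
    }
    // strtr() swaps every key for its value in a single pass.
    return strtr($pattern, $replacements);
}

// Sample values invented for illustration only.
$values = array(
    'biblio_type' => 'thesis',
    'biblio_year' => '2009',
    'title-raw'   => 'an-example-title',
);

$path = example_fill_pattern(
    'collections/[biblio_type]/[biblio_year]/[title-raw]',
    $values
);
echo $path, "\n"; // collections/thesis/2009/an-example-title
```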

With this rule in place, and some minor changes to the Pathauto defaults, adding new biblio content produces readable URLs:


Unfortunately, some of the full titles are exceptionally long, with a title such as The 'Birth of the Prison' and the Death of Convictism: The operation of law in pre-separation Queensland, 1839 to 1859 resulting in a truncated URL:


The Biblio module's short title field would be ideal for controlling the path result, allowing back end users to easily produce readable paths without requiring any administrative access to Pathauto's features. By default the Biblio module only provides tokens for the publication year, the authors, and the type id and name. So it's time to hook into Token and add some of our own...

As always, we start by creating a suitable .info file, then adding to the .module the initial hook to display what tokens are being added:

function biblio_tokens_token_list($type = 'all') {
    $tokens = array();
    // Only offer the tokens in node contexts or in lists of all tokens.
    if ($type == 'node' || $type == 'all') {
        $tokens['node']['biblio_short_title'] = t("Biblio: Short title");
        $tokens['node']['biblio_short_title-raw'] = t("Biblio: Short title (raw)");
    }
    return $tokens;
}

This adds two new tokens, biblio_short_title and biblio_short_title-raw (the differences will be explained below). The $type parameter indicates the context: tokens can be assigned to be available on nodes, users, comments etc. In this case, only Biblio objects have biblio_ fields, so the new tokens should only appear in node-related contexts. The 'all' parameter ensures the new replacement patterns will appear in any lists of all tokens.
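The .info file mentioned above is not shown. A minimal sketch might look like the following, assuming Drupal 6 (core = 6.x) and the module name biblio_tokens implied by the hook prefix. The name and description strings are placeholders.

```ini
; biblio_tokens.info (assumed filename, matching the biblio_tokens_ hook prefix)
name = Biblio Tokens
description = "Exposes additional Biblio fields as tokens."
core = 6.x
dependencies[] = token
dependencies[] = biblio
```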

Now that the new tokens have been identified, they need to be tied to actual data via the values hook:

function biblio_tokens_token_values($type, $object = NULL) {
    $values = array();
    if ($type == 'node') {
        if ($object->type == 'biblio') {
            // Biblio nodes: prefer the short title, falling back to the title.
            $clean = _clean(_short_title($object));
            $raw = _raw(_short_title($object));
        } else {
            $clean = _clean($object->title);
            $raw = _raw($object->title);
        }
        $values['biblio_short_title'] = $clean;
        $values['biblio_short_title-raw'] = $raw;
    }
    return $values;
}

function _clean($str) { return check_plain($str); }
function _raw($str)   { return $str; }

function _short_title($biblio) {
    $title = $biblio->biblio_short_title;
    // Fall back to the full title when no short title has been entered.
    $title = empty($title) ? $biblio->title : $title;
    return $title;
}

The hook function biblio_tokens_token_values only has to deal with nodes. If the node is a Biblio object, it uses the short title when present and falls back to the full title otherwise; for any other kind of node, it just uses the title.
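The fallback can be tried outside Drupal with plain objects standing in for nodes. This is only a sketch: example_token_source is a hypothetical function that condenses the logic above, and the sample titles are invented.

```php
<?php
// Hypothetical condensed version of the fallback logic, using plain
// objects in place of real Drupal nodes.
function example_token_source($node) {
    if ($node->type == 'biblio' && !empty($node->biblio_short_title)) {
        return $node->biblio_short_title;
    }
    return $node->title;
}

// Biblio node with a short title: the short title wins.
$with_short = (object) array(
    'type' => 'biblio',
    'title' => 'A Very Long Full Title',
    'biblio_short_title' => 'Short Title',
);

// Biblio node with an empty short title: falls back to the full title.
$without_short = (object) array(
    'type' => 'biblio',
    'title' => 'A Very Long Full Title',
    'biblio_short_title' => '',
);

// Any other node type: always the title.
$page = (object) array('type' => 'page', 'title' => 'About');

echo example_token_source($with_short), "\n";    // Short Title
echo example_token_source($without_short), "\n"; // A Very Long Full Title
echo example_token_source($page), "\n";          // About
```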

Both 'clean' and 'raw' versions of the title are returned. A cleaned title has simply passed through Drupal's check_plain function, which escapes special characters as HTML entities so the text can be displayed safely, while a raw one has not. Because Pathauto performs its own cleaning (it can't rely on users to generate valid paths), the -raw variants of tokens are recommended for use with it. (In the example above, _raw is just a simple passthrough function; while a previous version included some minor transforms that have since been removed, I felt the passthrough helped indicate the intent of the code.)
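The difference between the two variants is easiest to see on a title containing quotes. The sketch below runs outside Drupal, so it uses a hypothetical example_check_plain built on PHP's htmlspecialchars as a stand-in; treat the equivalence with check_plain as an assumption rather than a guarantee.

```php
<?php
// Stand-in for Drupal's check_plain(), which is unavailable outside Drupal.
// Assumption: htmlspecialchars() with ENT_QUOTES approximates its escaping.
function example_check_plain($text) {
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$title = "The 'Birth of the Prison' and the Death of Convictism";

$clean = example_check_plain($title); // quotes become HTML entities
$raw = $title;                        // left untouched

echo $clean, "\n"; // The &#039;Birth of the Prison&#039; and the Death of Convictism
echo $raw, "\n";   // The 'Birth of the Prison' and the Death of Convictism
```

Feeding the escaped form into a path would turn the entity text itself into part of the URL, which is why the raw value is the better input for a module that does its own cleaning.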

Once the module is enabled, the two new tokens of [biblio_short_title] and [biblio_short_title-raw] can be seen in the replacement patterns list of all node-related features that support Token.
