Thursday, January 28, 2010

The Unbearable Lowness of Defaults

While testing newly written PDF-to-text behaviour for a project, I noticed that the body text for some of the nodes wasn't displaying. The text was present in the database, would load into the editor and appeared in the preview, but it wouldn't show at all in the full node view, and it wasn't present in the HTML source of the page either.

Monday, January 25, 2010

Drupal: Exposing Data through Tokens

An often overlooked aspect of site development is the URL schema. The paths used to access a site form a type of interface; it's easier to remember that an index of all thesis pages exists at /collections/thesis than at /biblio/type/108. During development, a known and consistent schema can help with quick navigation during testing and remove the need to constantly look up exposed interfaces when implementing the UI.

Wednesday, January 20, 2010

Drupal Actions: extending biblio to extract full text

The Bibliography (biblio) module for Drupal provides a convenient way to harvest records from other repositories and catalogues. A requirement for one project was to allow for searching across the full text content of digitally stored books, which is not always stored in other catalogues. The most direct approach was to grab a copy of the digital object (usually in PDF format), run a system-level tool to extract the text and update the Biblio record to contain it. As with most things Drupal, the trick was to find the right place to hook this functionality in, and in this case I used Actions.

Rather than modify the biblio module directly, I wanted to extend it separately if possible, to minimise the need to revisit the code every time biblio is updated. To this end I created the rather inelegantly named biblio_full_text module.
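To make the extraction step above concrete, here is a minimal standalone sketch in Python. It is not the module's actual code, which would live in a Drupal action. It assumes the system-level tool is Poppler's pdftotext, which the post does not name, and it uses a plain dictionary with a hypothetical "full_text" key as a stand-in for the Biblio record.

```python
import subprocess
from typing import Callable, List


def build_pdftotext_command(pdf_path: str) -> List[str]:
    """Build the command line for the (assumed) pdftotext tool."""
    # Passing "-" as the output file makes pdftotext write to stdout.
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_full_text(pdf_path: str, run: Callable = subprocess.run) -> str:
    """Run the extraction tool and return the text it produced."""
    result = run(
        build_pdftotext_command(pdf_path),
        capture_output=True,  # collect stdout instead of printing it
        check=True,           # raise if the tool exits with an error
    )
    return result.stdout.decode("utf-8", errors="replace").strip()


def attach_full_text(record: dict, pdf_path: str,
                     run: Callable = subprocess.run) -> dict:
    """Return a copy of the record with the extracted text added.

    The "full_text" key is hypothetical; it stands in for whichever
    field of the Biblio record ends up holding the text.
    """
    updated = dict(record)
    updated["full_text"] = extract_full_text(pdf_path, run)
    return updated
```

The `run` parameter exists only so that the subprocess call can be swapped out when pdftotext is not installed, for example in tests. In normal use, `attach_full_text(record, "/path/to/book.pdf")` calls the real tool.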