Thursday, January 28, 2010

The Unbearable Lowness of Defaults

While testing newly written PDF-to-text behaviour for a project, I noticed that the body text for some of the nodes wasn't displaying. The text was present in the database, loaded into the editor and appeared in the preview, but it wouldn't show at all in the full node view, and it wasn't sitting unrendered in the page's HTML source either.

After some trial and error, I discovered that the problem occurred on nodes whose body text was larger than 32KB. As soon as the text was trimmed to that size or below, it rendered correctly. With one extra character it stopped again.

Searching through Drupal's issues, I eventually found something that seemed relevant: the line break converter input filter was known to output nothing at all for some existing body text. But while this was where the problem surfaced in Drupal, it wasn't the root cause.

The Perl Compatible Regular Expressions (PCRE) library is written in C and has been integrated into PHP to provide performant regexp handling. While the PCRE devs had set the defaults for two configuration values (the backtrack and recursion limits) to 10,000,000, the PHP devs had chosen a significantly lower value of 100,000 for both. As a comment in the (three years old!) PHP bug report mentions:
"The low limit causes scripts which hit it to fail without a warning, notice or error"
To counter this issue, simply modify the appropriate lines in php.ini to:
pcre.backtrack_limit=1000000
pcre.recursion_limit=1000000
If you don't have access to your server's PHP configuration, you can set the directives at runtime in your Drupal site's settings.php:
ini_set('pcre.backtrack_limit', 1000000);
ini_set('pcre.recursion_limit', 1000000);
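
If the server already runs with higher limits than these, an unconditional ini_set() would lower them. A minimal sketch of a guard, assuming a hypothetical helper named raise_pcre_limit() (not part of PHP or Drupal), that only raises a limit and reports the value actually in effect:

```php
<?php
// Hypothetical helper, not part of PHP or Drupal: raise a PCRE limit
// only when the value currently in effect is lower than $minimum.
function raise_pcre_limit($directive, $minimum)
{
    $current = (int) ini_get($directive);
    if ($current >= $minimum) {
        // Already high enough, leave the server's setting alone.
        return $current;
    }
    // ini_set() returns the old value on success and false on failure
    // (for example when the host has disabled the function).
    if (ini_set($directive, (string) $minimum) === false) {
        return $current;
    }
    // Read the value back to confirm what is actually in effect.
    return (int) ini_get($directive);
}

$backtrack = raise_pcre_limit('pcre.backtrack_limit', 1000000);
$recursion = raise_pcre_limit('pcre.recursion_limit', 1000000);
```

If either returned value is still below the target, the host is most likely blocking runtime changes and php.ini is the only place left to fix it.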
The silent failure made this very hard to track down, requiring a lot of google-fu to work out how best to describe the (absence of) behaviour.
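
For anyone chasing something similar, the failure can be made visible: preg_replace() returns NULL when PCRE gives up, and preg_last_error() reports why. The sketch below is only an illustration, not what Drupal's filter does. The wrapper name checked_preg_replace(), the pattern and the deliberately tiny limit are all assumptions chosen to trigger the error quickly.

```php
<?php
// Hypothetical wrapper: turn PCRE's silent NULL into an exception.
function checked_preg_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        throw new RuntimeException(
            'preg_replace() failed, preg_last_error() code: ' . preg_last_error()
        );
    }
    return $result;
}

// Demonstration only: lower the limit on purpose, and switch off the
// PCRE JIT where that setting exists, so the backtrack limit applies.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

// A pattern that backtracks heavily on a subject it cannot match.
$subject = str_repeat('a', 40) . 'b';
$silent = preg_replace('/^(a+)+$/', 'x', $subject); // no warning is raised
$code = preg_last_error();

try {
    checked_preg_replace('/^(a+)+$/', 'x', $subject);
    $caught = false;
} catch (RuntimeException $e) {
    $caught = true;
}
```

Comparing $code against PREG_BACKTRACK_LIMIT_ERROR and PREG_RECURSION_LIMIT_ERROR tells you which of the two limits was hit.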
