4.2.1 Cocoon and cross-media publishing

By mchoate
Last modified: 2006-09-04 13:26:24

I have recently been working on a project to reformat existing HTML pages into PDF pages as a way to facilitate the sale of reprints for newspaper and magazine publishers. I have used this project as an excuse to learn Cocoon and have been very pleased with what I have learned.

Cocoon calls itself an XML publishing framework, and the term fits. It is not a content management system (although Lenya is a content management system built on Cocoon) but a tool for leveraging the inherent flexibility of XML by transforming source data into a variety of formats. Transforming data is central to Cocoon, hence the name.

This transformation takes place in what Cocoon refers to as the pipeline. At one end of the pipeline, Cocoon generates the source data, either from an XML file, a database, or just about any source that can be turned into SAX events. (SAX is a commonly used API for parsing XML that fires events at key moments during parsing, such as when the parser encounters a new element or reaches the end of the document.) This stream of events then passes through one or more transformers, which usually means applying an XSLT stylesheet to the source data. Once transformed, the data must be serialized: the XML "tree" that is in memory must be converted to text and stored in a file, sent to a Web server, or something similar. Out of the box, Cocoon can serialize to XML, XHTML, WML, PDF, SVG and text.
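The generate, transform, serialize sequence can be sketched outside Cocoon using only the SAX and XSLT classes that ship with the JDK. This is not Cocoon's own API or sitemap configuration, just a minimal illustration of the same idea. The class name PipelineSketch, the sample document, and the stylesheet are all hypothetical and inlined so the sketch is self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

final class PipelineSketch {
    // Hypothetical source document standing in for an article pulled from a CMS.
    static final String SOURCE_XML =
        "<article><title>Reprint</title><body>Hello</body></article>";

    // Hypothetical stylesheet that turns the article into a small XHTML-like page.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/article'>"
      + "<html><body><h1><xsl:value-of select='title'/></h1>"
      + "<p><xsl:value-of select='body'/></p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xml, String xslt) throws Exception {
        // Transformer stage: an XSLT stylesheet wrapped as a SAX event handler.
        SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler transformer =
            factory.newTransformerHandler(new StreamSource(new StringReader(xslt)));

        // Serializer stage: the transformed result is written out as text.
        StringWriter out = new StringWriter();
        transformer.setResult(new StreamResult(out));

        // Generator stage: a SAX parser reads the source and fires events
        // (startElement, characters, endDocument, ...) into the transformer.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader generator = parserFactory.newSAXParser().getXMLReader();
        generator.setContentHandler(transformer);
        generator.parse(new InputSource(new StringReader(xml)));

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(SOURCE_XML, STYLESHEET));
    }
}
```

In Cocoon itself these stages are declared in a sitemap rather than wired by hand, and swapping the serializer is what changes the output format. The hand-wired version only shows how SAX events flow from one stage to the next.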

What I have found most interesting is that Cocoon need not replace your current content management system. It works perfectly well as an adjunct to it, especially if that system produces solid XML or clean HTML. In fact, it might provide the most benefit to cross-media publishers as a kind of "glue" system that links print editorial systems with their online counterparts, archives, and third-party syndicates.

Once I have completed the reprint application, I will write an article that walks through the entire process. It should be a good introduction to the flexibility that Cocoon offers, and might help generate ideas about how to use it in other situations.