Chapter 7. Transformation with XSLT

Transformation is one of the most important and useful techniques for working with XML. To transform XML is to change its structure, its markup, and perhaps its content into another form. A transformed document may be subtly altered or profoundly changed. The process is controlled by a configuration document, variously called a stylesheet or a transformation script.

There are many reasons to transform XML. Most often, the goal is to extend the reach of a document into new areas by converting it into a presentational format. Alternatively, you can use a transformation to alter the content, such as extracting a section or totaling a table of numbers. You can even use one as a filter that changes an XML document in very small ways, such as inserting an attribute into a particular kind of element.
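
To make the filtering case concrete, here is a minimal sketch of an XSLT 1.0 stylesheet that copies a document unchanged except for adding one attribute to one kind of element. The element name note and the attribute role="reviewed" are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy every node and attribute through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for the hypothetical <note> element:
       copy it, keep its existing attributes, and add one more. -->
  <xsl:template match="note">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="role">reviewed</xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The attribute is created before the element's children are processed because XSLT requires attributes to be added to an element before any child nodes. If a note element already carries a role attribute, the value set here replaces it.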

Some uses of transformation are:

  • Converting a document in a non-presentational XML application such as DocBook into HTML for display as web pages.

  • Formatting a document to create a high-quality presentational format like PDF, through the XSL-FO path.

  • Changing one XML vocabulary to another, for example transforming an organization-specific invoice format into a cross-industry format.

  • Extracting specific pieces of information and formatting them in another way, such as constructing a table of contents from section titles.

  • Changing an instance of XML into text, such as transforming an XML data file into a comma-delimited file that you can import into Excel as a spreadsheet.

  • Reformatting or generating content. For example, numeric values can be massaged to turn integers into floating point numbers or Roman numerals as a way to create your own numbered lists or section heads.

  • Polishing a rough document to fix common mistakes or remove unneeded markup, preparing it for later processing.
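
As an illustration of the XML-to-text use in the list above, the following sketch turns a small data file into comma-delimited text. It assumes a hypothetical input whose root element inventory contains item elements, each with a name and a price child; none of these names come from a real vocabulary.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit plain text rather than XML markup. -->
  <xsl:output method="text"/>

  <!-- Assumed input shape (hypothetical):
       <inventory><item><name>…</name><price>…</price></item>…</inventory> -->
  <xsl:template match="/inventory">
    <!-- Header row, ended with a line feed (&#10;). -->
    <xsl:text>name,price&#10;</xsl:text>
    <!-- One output line per item. -->
    <xsl:for-each select="item">
      <xsl:value-of select="name"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="price"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This sketch does no quoting, so a value that itself contains a comma would produce a malformed row. A production stylesheet would need to wrap such values in quotation marks.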

Transformations may seem like magic, but they are a powerful and not too complicated way to squeeze more use out of your XML.