E.5 XSLT

One of the most important technologies to come out of the W3C is eXtensible Stylesheet Language Transformations (XSLT). XSLT provides a way to transform one type of XML document into another using a language written entirely in XML. XSLT works by allowing developers to create one or more template rules that are applied to the various elements in the source document to produce a second, transformed document.

While the basic concept behind XSLT is quite simple (apply these rules to the elements that match these conditions), writing good XSLT stylesheets is a huge topic that we could never hope to cover here. We will instead provide a small example that illustrates the basic XSLT syntax.

First, though, we need to configure AxKit to transform XML documents using an XSLT processor. For this example, we will assume that you already have the GNOME XML and XSLT libraries (libxml2 and libxslt, available at http://xmlsoft.org/) and their associated Perl modules (XML::LibXML and XML::LibXSLT) installed on your server.

Adding this line to your httpd.conf file tells AxKit to process all XML documents with a stylesheet processing instruction whose type is "text/xsl" with the LibXSLT language module:

AxAddStyleMap text/xsl Apache::AxKit::Language::LibXSLT

E.5.1 Anatomy of an XSLT Stylesheet

All XSLT stylesheets contain the following:

  • An XML declaration (optional)

  • An <xsl:stylesheet> element as the document's root element

  • Zero or more template rules

Consider the following bare-bones stylesheet:

<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">
  <xsl:template match="/">
    <!-- the content for the output document contained here -->
  </xsl:template>
</xsl:stylesheet>

Note that the root template (defined by the match="/" attribute) will be called without regard for the contents of the XML document being processed. As such, this is the best place to put the top-level elements that we want to include in the output of each and every document being transformed with this stylesheet.

E.5.2 Template Rules and Recursion

Let's take our basic stylesheet and extend it to allow us to transform the DocBook XML document presented in Example E-8 into HTML.

Example E-8. camelhistory.xml
<?xml version="1.0"?>
<book>
<title>Camels: An Historical Perspective</title>
<chapter>
  <title>Chapter One</title>
  <para>
     It was a dark and <emphasis>stormy</emphasis> night...
  </para>
</chapter>
</book>

First we need to alter the root template of our stylesheet:

<xsl:template match="/">
  <html>
    <head><xsl:copy-of select="/book/title"/></head>
    <body>
      <xsl:apply-templates/>
    </body>
  </html>
</xsl:template>

Here we have created the top-level structure of our output document and copied the book's <title> element into the <head> element of our HTML page. The <xsl:apply-templates/> element tells the XSLT processor to pass the children of the current node on for further processing. In this case the current node is the document root, whose only child element is <book>, the top-level element of the source document.
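Our stylesheet never defines a template for <book> itself, yet its children still get processed. That works because XSLT 1.0 supplies built-in template rules for nodes that no explicit template matches. As a sketch (based on the XSLT 1.0 specification; you do not need to add these to your stylesheet), the built-in rules behave roughly as if the following had been written:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <!-- Built-in rule for elements and the root node:
       produce nothing, but keep recursing into the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Built-in rule for text and attribute nodes:
       copy the text value through to the output -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

An explicit template that matches the same node takes precedence over these built-in rules, which is why our own match="/" template is the one that runs for the root.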

Now we need to create template rules for the other elements in the document:

<xsl:template match="chapter">
  <div class="chapter">
    <xsl:attribute name="id">chapter_id<xsl:number
    value="position( )" format="A"/></xsl:attribute>
    <xsl:apply-templates/>
  </div>
</xsl:template>
<xsl:template match="para">
  <p><xsl:apply-templates/></p>
</xsl:template>

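Here we see more examples of recursive processing. The <chapter> and <para> elements are transformed into <div> and <p> elements, respectively, and the contents of those elements are passed along for further processing. Note also that the XPath expressions used within the template rules are evaluated in the context of the current element being processed. Evaluating expressions relative to the node being processed is an example of XSLT following the principle of "least surprise". XSLT also maintains what is called the "current node list," which is the list of nodes being processed. In the example above, it is the list of nodes selected by the <xsl:apply-templates/> call that reached the chapter template (the children of <book>, including the <title> element and any whitespace-only text nodes), so position( ) returns the chapter's place in that list rather than its rank among chapters.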
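Because position( ) counts every node in the current node list, not just chapters, the letter it produces depends on what else sits alongside the chapter in the source document. If the intent is to letter the chapters themselves, one alternative is the count attribute of <xsl:number>. The following is a minimal sketch of that variation, not the stylesheet used in the rest of this section:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <xsl:template match="chapter">
    <div class="chapter">
      <!-- count="chapter" numbers this element among its
           chapter siblings only, ignoring titles and text nodes -->
      <xsl:attribute name="id">chapter_id<xsl:number
      count="chapter" format="A"/></xsl:attribute>
      <xsl:apply-templates/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

With this variation, the first <chapter> in a book receives the id chapter_idA, the second chapter_idB, and so on, regardless of surrounding whitespace.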

While this sort of recursive processing is extremely powerful, it can also be quite a performance hit[3] and is necessary only for those cases where the current element contains other elements that need to be processed. If we know that a particular element will not contain any other elements, we need only output that element's text value, using <xsl:value-of>.

[3] Although, since XSLT engines tend to be written in C, they are still very fast (often faster than most compiled Perl templating solutions).

<xsl:template match="emphasis">
  <em><xsl:value-of select="."/></em>
</xsl:template>
<xsl:template match="chapter/title">
  <h2><xsl:value-of select="."/></h2>
</xsl:template>
<xsl:template match="book/title">
  <h1><xsl:value-of select="."/></h1>
</xsl:template>
</xsl:stylesheet>

Look closely at the last two template elements. Both match a <title> element, but one defines the rule for handling titles whose parent is a book element, while the other handles the chapter titles. The match attribute accepts any valid XSLT pattern: a restricted form of XPath location path that steps through child elements and attributes and may include predicates (which can in turn contain arbitrary XPath expressions and function calls).
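As a further illustration, a match pattern can carry a predicate to single out particular elements. The sketch below assumes a hypothetical role="epigraph" attribute on some <para> elements (it does not appear in Example E-8) and an illustrative class name. When both templates match the same element, the pattern with the predicate wins, because XSLT 1.0 assigns it a higher default priority than a bare element name:

```xml
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <!-- General rule: every para becomes a plain paragraph -->
  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- More specific rule (hypothetical role value):
       paras marked as epigraphs get their own class -->
  <xsl:template match="para[@role='epigraph']">
    <p class="epigraph"><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```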

Finally, we need only save our stylesheet as docbook-snippet.xsl. Once our source document is associated with this stylesheet (see Section E.6 later in this appendix), we can point our browser to camelhistory.xml, and we'll see the output generated by the code in Example E-9.
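Section E.6 covers the association in detail. As a rough preview, one way to make it is a stylesheet processing instruction of type "text/xsl" placed after the XML declaration of the source document. The sketch below assumes that docbook-snippet.xsl sits in the same directory as camelhistory.xml; adjust the href value if yours lives elsewhere:

```xml
<?xml version="1.0"?>
<!-- href is an assumed relative path to the stylesheet -->
<?xml-stylesheet href="docbook-snippet.xsl" type="text/xsl"?>
<book>
<title>Camels: An Historical Perspective</title>
<!-- chapters as in Example E-8 -->
</book>
```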

Example E-9. camelhistory.html
<?xml version="1.0"?>
<html>
  <head>
    <title>Camels: An Historical Perspective</title>
  </head>
  <body>
    <h1>Camels: An Historical Perspective</h1>
    <div class="chapter" id="chapter_idD">
      <h2>Chapter One</h2>
      <p>
         It was a dark and <em>stormy</em> night...
      </p>
    </div>
  </body>
</html>

The entire stylesheet is shown in Example E-10.

Example E-10. docbook-snippet.xsl
<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

 <xsl:template match="/">
  <html>
    <head><xsl:copy-of select="/book/title"/></head>
    <body>
      <xsl:apply-templates/>
    </body>
  </html>
 </xsl:template>

 <xsl:template match="chapter">
  <div class="chapter">
    <xsl:attribute name="id">chapter_id<xsl:number
    value="position( )" format="A"/></xsl:attribute>
    <xsl:apply-templates/>
  </div>
 </xsl:template>
 <xsl:template match="para">
  <p><xsl:apply-templates/></p>
 </xsl:template>

 <xsl:template match="emphasis">
  <em><xsl:value-of select="."/></em>
 </xsl:template>
 <xsl:template match="chapter/title">
  <h2><xsl:value-of select="."/></h2>
 </xsl:template>
 <xsl:template match="book/title">
  <h1><xsl:value-of select="."/></h1>
 </xsl:template>
</xsl:stylesheet>

E.5.3 Learning More

We have only scratched the surface of how XSLT can be used to transform XML documents. For more information, see the following resources:

  • The XSLT specification: http://www.w3.org/TR/xslt

  • Miloslav Nic's XSLT reference: http://www.zvon.org/xxl/XSLTreference/Output/index.html

  • Jeni Tennison's XSLT FAQ: http://www.jenitennison.com/xslt/index.html


