7.2 Concepts

Before we jump into specifics, I want to explain some important concepts that will help you understand how XSLT works. An XSLT processor (I'll call it an XSLT engine) takes two things as input: an XSLT stylesheet to govern the transformation process and an input document called the source tree. The output is called the result tree.

The XSLT stylesheet controls the transformation process. While it is usually called a stylesheet, it is not necessarily used to apply style. This is just a term inherited from the original intention of using XSLT to construct XSL-FO trees. Since XSLT is used for many other purposes, it may be better to call it an XSLT script or transformation document, but I will stick with the convention to avoid confusion.

The XSLT engine is a state machine. That is, at any point in time it has a state, and rules drive processing forward based on that state. The state consists of the variables defined so far plus a set of context nodes, the nodes that are next in line for processing. The process is recursive: each node processed may have children that also need processing, in which case the current context node set is shelved until the recursion has completed.

The XSLT engine begins by reading in the XSLT stylesheet and caching it as a look-up table. For each node it processes, it will look in the table for the best matching rule to apply. The rule specifies what to output to build its part of the result tree, and also how to continue processing. Starting from the root node, the XSLT engine finds rules, executes them, and continues until there are no more nodes in its context node set to work with. At that point, processing is complete and the XSLT engine outputs the result document.
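It helps to know what happens when the look-up finds no rule for a node. XSLT 1.0 defines built-in template rules for that case. The sketch below writes them out explicitly as a complete stylesheet. A real stylesheet never needs to include them, and the text output method is only an assumption made for this illustration.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
>
  <!-- Assumption for this sketch: plain text output -->
  <xsl:output method="text"/>

  <!-- Root node and elements: output nothing, continue with the children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the result -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: output nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

Run against a document, this stylesheet outputs the document's text content with the markup stripped. Attribute values do not appear, because apply-templates without a select attribute visits child nodes only, and attributes are not children of their element.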

Let us now look at an example. Consider the document in Example 7-1.

Example 7-1. Instruction guide for a model rocket
<manual type="assembly" id="model-rocket">
  <parts-list>
    <part label="A" count="1">fuselage, left half</part>
    <part label="B" count="1">fuselage, right half</part>
    <part label="F" count="4">steering fin</part>
    <part label="N" count="3">rocket nozzle</part>
    <part label="C" count="1">crew capsule</part>
  </parts-list>
  <instructions>
    <step>
Glue <part ref="A"/> and <part ref="B"/> together to form the
fuselage.
    </step>
    <step>
For each <part ref="F"/>, apply glue and insert it into slots in the
fuselage.
    </step>
    <step>
Affix <part ref="N"/> to the fuselage bottom with a small amount of
glue.
    </step>
    <step>
Connect <part ref="C"/> to the top of the fuselage. Do not use
any glue, as it is spring-loaded to detach from the fuselage.
    </step>
  </instructions>
</manual>

Suppose you want to format this document in HTML with an XSLT transformation. The following plain English rules describe the process:

  1. Starting with the manual element, set up the "shell" of the document, in this case the html element, the title, and a top-level heading.

  2. For the parts-list element, output a heading and the container element for a definition list.

  3. For each part with a label attribute, create an entry in the parts list: a dt element for the label and a dd element for the name.

  4. For each part with a ref attribute, output some text only: the name and label of the part.

  5. The instructions element is a numbered list, so output the container element for that.

  6. For each step element, output an item for the instructions list.

The stylesheet in Example 7-2 follows the same structure as these English rules, with a template for each.

Example 7-2. XSLT stylesheet for the instruction guide
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
>
  <xsl:output method="xml" encoding="ISO-8859-1"/>

  <!-- Handle the document element: set up the HTML page -->
  <xsl:template match="manual">
    <html>
      <head><title>Instructions Guide</title></head>
      <body>
        <h1>Instructions Guide</h1>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>

  <!-- Create a parts list -->
  <xsl:template match="parts-list">
    <h2>Parts</h2>
    <dl>
      <xsl:apply-templates/>
    </dl>
  </xsl:template>

  <!-- One use of the <part> element: item in a list -->
  <xsl:template match="part[@label]">
    <dt>
      <xsl:value-of select="@label"/>
    </dt>
    <dd>
      <xsl:apply-templates/>
    </dd>
  </xsl:template>

  <!-- another use of the <part> element: generate part name -->
  <xsl:template match="part[@ref]">
    <xsl:variable name="label" select="@ref" />
    <xsl:value-of select="//part[@label = $label]" />
    <xsl:text> (Part </xsl:text>
    <xsl:value-of select="@ref" />
    <xsl:text>)</xsl:text>
  </xsl:template>

  <!-- Set up the instructions list -->
  <xsl:template match="instructions">
    <h2>Steps</h2>
    <ol>
      <xsl:apply-templates/>
    </ol>
  </xsl:template>

  <!-- Handle each item (a <step>) in the instructions list -->
  <xsl:template match="step">
    <li>
      <xsl:apply-templates/>
    </li>
  </xsl:template>

</xsl:stylesheet>

You will notice that each rule in the verbal description has a corresponding template element, and that each template contains a balanced (well-formed) piece of XML. Namespaces help the XSLT engine tell instructions apart from markup to output in the result tree: instructions are the elements in the XSLT namespace, which this stylesheet binds to the prefix xsl. The match attribute in each template element assigns the template to a piece of the source tree using a pattern, a restricted form of XPath expression.
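Because patterns can overlap, the engine needs a way to pick the best match. The sketch below is a hypothetical reduction of Example 7-2, not part of it: it adds a catch-all template for part elements that carry neither attribute. In XSLT 1.0, a pattern that is just an element name has a default priority of 0, while the same name with a predicate has 0.5, so the more specific templates win whenever they apply. The text output method and the fallback wording are assumptions made for illustration.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
>
  <xsl:output method="text"/>

  <!-- Hypothetical fallback, default priority 0: any other <part> -->
  <xsl:template match="part">
    <xsl:text>[unidentified part]</xsl:text>
  </xsl:template>

  <!-- Default priority 0.5: chosen over the fallback for parts-list entries -->
  <xsl:template match="part[@label]">
    <xsl:value-of select="@label"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Default priority 0.5: chosen over the fallback for cross-references -->
  <xsl:template match="part[@ref]">
    <xsl:value-of select="//part[@label = current()/@ref]"/>
  </xsl:template>
</xsl:stylesheet>
```

In Example 7-1 every part has either a label or a ref, so the fallback never fires. A part carrying both attributes would match two templates of equal priority, which XSLT 1.0 treats as an error. A processor that does not report the error recovers by using the template that occurs last in the stylesheet.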

A template is a mixture of markup, text content, and XSLT instructions. The instructions may be conditional statements (if these conditions are true, output this), content formatting functions, or instructions to redirect processing to other nodes. The element apply-templates, for example, tells the XSLT engine to move processing to a new set of context nodes; when it has no select attribute, as here, those are the children of the current node.
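As a sketch of a conditional instruction, the following hypothetical variant of the part[@label] template from Example 7-2 also reports the quantity, but only when more than one of the part is needed. It relies on the count attribute from Example 7-1, which the original stylesheet ignores. The "(x4)" style of wording is an arbitrary choice.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
>
  <xsl:template match="part[@label]">
    <dt>
      <xsl:value-of select="@label"/>
    </dt>
    <dd>
      <xsl:apply-templates/>
      <!-- Conditional: add the quantity only when count is greater than 1 -->
      <xsl:if test="@count &gt; 1">
        <xsl:text> (x</xsl:text>
        <xsl:value-of select="@count"/>
        <xsl:text>)</xsl:text>
      </xsl:if>
    </dd>
  </xsl:template>

  <!-- Other templates from Example 7-2 would go here unchanged -->
</xsl:stylesheet>
```

With this template, the entry for part F would read "steering fin (x4)" and the entry for part N "rocket nozzle (x3)". Parts with a count of 1 are output as before.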

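The result of running a transformation with the above document and XSLT stylesheet is a formatted HTML page. Whitespace may vary, and because the stylesheet selects the xml output method, most processors will also write an XML declaration before the html element: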

<html>
  <head><title>Instructions Guide</title></head>
  <body>
    <h1>Instructions Guide</h1>
    <h2>Parts</h2>
    <dl>
      <dt>A</dt>
      <dd>fuselage, left half</dd>
      <dt>B</dt>
      <dd>fuselage, right half</dd>
      <dt>F</dt>
      <dd>steering fin</dd>
      <dt>N</dt>
      <dd>rocket nozzle</dd>
      <dt>C</dt>
      <dd>crew capsule</dd>
    </dl>
    <h2>Steps</h2>
    <ol>
      <li>
Glue fuselage, left half (Part A) and fuselage, right half (Part B)
together to form the fuselage.
      </li>
      <li>
For each steering fin (Part F), apply glue and insert it into slots in
the fuselage.
      </li>
      <li>
Affix rocket nozzle (Part N) to the fuselage bottom with a small
amount of glue.
      </li>
      <li>
Connect crew capsule (Part C) to the top of the fuselage. Do not use
any glue, as it is spring-loaded to detach from the fuselage.
      </li>
    </ol>
  </body>
</html>

As you can see, the elements in the source tree have been mapped to different elements in the result tree, converting the document from one format to another. That is one example of XSLT in action.
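One detail of Example 7-2 deserves a second look: the expression //part[@label = $label] searches the whole source tree each time a cross-reference is resolved. A common alternative in XSLT 1.0 is to declare a key, which lets the engine index the labeled parts. The sketch below is a hypothetical variation, not part of the original stylesheet, and the key name part-by-label is invented for illustration.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
>
  <!-- Index every labeled <part> by the value of its label attribute.
       The name part-by-label is made up for this sketch. -->
  <xsl:key name="part-by-label" match="part[@label]" use="@label"/>

  <!-- Cross-reference: look the part up in the index -->
  <xsl:template match="part[@ref]">
    <xsl:value-of select="key('part-by-label', @ref)"/>
    <xsl:text> (Part </xsl:text>
    <xsl:value-of select="@ref"/>
    <xsl:text>)</xsl:text>
  </xsl:template>

  <!-- Other templates from Example 7-2 would go here unchanged -->
</xsl:stylesheet>
```

The result tree is the same as before. For a document this small the difference is negligible. Whether a key pays off on larger documents depends on the processor.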