XSLT stylesheets are collections of templates. Each template associates a condition (e.g., an element in the source tree with a particular attribute) with a mixture of output data and instructions. These instructions refine and redirect processing, extending the simple matching mechanism to give you full control over the transformation.
A template does three things. First, it matches a class of node. The match attribute holds an XSLT pattern which, much like an XPath expression, matches nodes. When an XSLT processor is told to apply templates to a particular node, the processor runs through all the templates in the stylesheet and tests whether the node matches the template's pattern. All the templates that match this node are candidates for processing, and the XSLT processor must select one.
Second, the template contributes a priority value to help the processor decide which among eligible templates is the best to use. Among the templates that match the current node, the processor first keeps those with the highest import precedence and then picks the one with the highest priority. Different factors contribute to this priority. A template with more specific information will overrule one that is more generic. For example, one template may match all elements with the pattern *. Another may match a specific element, while a third matches that element and further requires an attribute. Alternatively, a template can simply state its priority to the processor using a priority attribute. This is useful when you want to force a template to be used where otherwise it would be overlooked.
The third role of a template is to specify the structure of the result tree. The template's content actually contains the elements and character data to be output in the result tree. So it is often possible to see, at a glance, how the result tree will look. XSLT elements interspersed throughout this content direct the processing to other templates.
This model for scripting a transformation has strong benefits. Templates are (usually) compact pieces of code that are easy to read and manage, like functions in a programming language. The match and priority attributes show exactly when each template is to be used. Transformation stylesheets are modular and can be combined with others to enhance or alter the flow of transformation.
The XSLT patterns used inside the match attributes of template elements are a subset of XPath expressions. The first restriction on XSLT patterns is that only descending axes may be used: child and attribute. The shorthand // can be used but it's not expanded. It simply would not make sense to use other axes in XSLT patterns.
The second difference is that paths are actually evaluated right to left, not the other direction as is usual with XPath. This is a more natural fit for the XSLT style of processing. As the processor moves through the source tree, it keeps a running list of nodes to process next, called the context node set. Each node in this set is processed in turn. The processor looks at the set of rules in the stylesheet, finds a few that apply to the node to be processed, and out of this set selects the best matching rule. The right-to-left processing helps the XSLT engine prioritize eligible templates.
Suppose there is a rule with a match pattern chapter/section/para. To test this pattern, the XSLT engine first instantiates the node-to-process as the context node. Then it asks these questions in order:
Is the context node an element of type para?
Is the parent of this node an element of type section?
Is the grandparent of this node an element of type chapter?
Logically, this is not so different from traditional XPath processing, which usually starts from some absolute node and works its way into the depths of the document. You just have to change your notion of where the path is starting from. It might make more sense to rewrite the match pattern like this:
abstract-node/child::chapter/child::section/child::para
where abstract-node is some node such that a location path extending from it matches a set of nodes that includes the node-to-process.
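As a minimal sketch of how this plays out in a stylesheet, consider two rules for para elements. The class values and the surrounding document structure are assumptions made up for illustration. The first rule applies only when the three right-to-left questions above are all answered yes; any other para falls through to the second rule.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Matches a para only if its parent is a section
       and its grandparent is a chapter -->
  <xsl:template match="chapter/section/para">
    <p class="body"><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Matches every para; used for those found elsewhere,
       for example directly under an appendix -->
  <xsl:template match="para">
    <p class="other"><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```

A para inside chapter/section matches both patterns; how the processor chooses between them is the subject of the precedence rules that follow.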
It is possible for more than one rule to match a node. In this case, the XSLT processor must select exactly one rule from the mix, and that rule should meet our expectations for best match. Here are the rules of precedence among matching patterns:
If the pattern contains multiple alternatives separated by vertical bars (|), each alternative is treated with equal importance, as though there were a separate rule for each.
A pattern that contains specific hierarchical information has higher priority than a pattern that contains general information. For example, the pattern chapter/section/para is more specific than para, so it takes precedence.
A wildcard is more general than a specific element or attribute name and therefore has lower priority. The pattern stuff takes priority over the wildcard pattern *. Note that this is not true when hierarchical information is included. stuff/cruft has exactly the same priority as stuff/* because they both specify hierarchical information about the node.
A pattern with a successful test expression in square brackets ([ ]) overrides a pattern with no test expression but that is otherwise identical. So bobo[@role="clown"] has higher priority than bobo. Again, this only works when no hierarchical information is included. circus/bobo and circus/bobo[@role="clown"] have the same priority.
Other information, such as position in the stylesheet, may be used to pare down the set if there is still more than one rule remaining.
The basic assumption is that rules that are more specific in their application take precedence over rules that are more general. If this were not the case, it would be impossible to write catch-all rules and default cases. Position and order don't come into play unless all other means of discrimination fail. XSLT 1.0 treats a remaining tie as an error; a processor may either report it or recover by choosing the matching rule that appears last in the stylesheet.
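The rules above correspond to the default priority values that XSLT 1.0 assigns to patterns. The following sketch annotates four rules with those values; the output element names (performer, clown, act) are hypothetical and exist only to make the chosen rule visible in the result.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Wildcard: default priority -0.5 -->
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Plain element name: default priority 0 -->
  <xsl:template match="bobo">
    <performer><xsl:apply-templates/></performer>
  </xsl:template>

  <!-- Name with a predicate: default priority 0.5 -->
  <xsl:template match="bobo[@role='clown']">
    <clown><xsl:apply-templates/></clown>
  </xsl:template>

  <!-- Hierarchical path: default priority 0.5 as well -->
  <xsl:template match="circus/bobo">
    <act><xsl:apply-templates/></act>
  </xsl:template>

</xsl:stylesheet>
```

A bobo with role="clown" that sits inside a circus matches the last two rules with equal priority, which is exactly the kind of tie that is left to the processor.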
The xsl:template element has an optional priority attribute that can be set to give it precedence over other rules and override the process of determination. The value must be a number, written as an integer or a decimal (for example 1, -2, or 0.5), and can be positive, negative, or zero. A larger number overrides a smaller number.
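Here is a small sketch of an explicit priority at work. It assumes a hypothetical status attribute on para that marks draft text. Without the priority attribute, both patterns would default to 0.5 and tie for a draft paragraph inside chapter/section; with it, the second rule always wins.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default priority 0.5 -->
  <xsl:template match="chapter/section/para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Explicit priority 2 beats the rule above, so draft
       paragraphs are dropped wherever they occur -->
  <xsl:template match="para[@status='draft']" priority="2"/>

</xsl:stylesheet>
```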
XSLT defines a set of default rules to make the job of writing stylesheets easier. If no rule from the stylesheet matches, the default rules provide an emergency backup system. Their general behavior is to carry over any text data in elements from the source tree to the result tree, and to assume an implicit xsl:apply-templates element to allow recursive processing. This implicit recursion visits child nodes only, so attribute values do not appear in the output unless a template selects them explicitly. The following list sums up the default rules for each type of node:
Processing starts at the root. To force processing of the entire tree, the default behavior is to apply templates to all the children. The rule looks like this:
<xsl:template match="/"> <xsl:apply-templates/> </xsl:template>
We want the processor to touch every element in the tree so it does not miss any branches for which rules are defined. The rule is similar to that for the root node:
<xsl:template match="*"> <xsl:apply-templates/> </xsl:template>
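Attributes share the built-in rule for text nodes, which copies the value to the result tree. Because the implicit xsl:apply-templates visits only child nodes, attributes never reach this rule unless a template selects them explicitly:
<xsl:template match="@*"> <xsl:value-of select="."/> </xsl:template>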
It is inconvenient to include the xsl:value-of element in every template to output text. Since we almost always want the text data to be output, it is done by default:
<xsl:template match="text()"> <xsl:value-of select="."/> </xsl:template>
Processing instructions are left out by default. The rule is this:
<xsl:template match="processing-instruction()"/>
Comments are also omitted from the result tree by default:
<xsl:template match="comment()"/>
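To see the default rules at work, consider a stylesheet that defines almost nothing. This is only a sketch; the parts-list element name is borrowed from the examples later in this section and is an assumption about the input. The built-in rules walk the whole tree and copy every text node, except inside the one element the stylesheet explicitly suppresses.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- No rules for the root or for ordinary elements:
       the built-in rules recurse and output all text -->

  <!-- The only explicit rule: skip parts-list and
       everything inside it -->
  <xsl:template match="parts-list"/>

</xsl:stylesheet>
```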
The template model of transformation creates islands of markup separate from each other. We need some way of connecting them so that processing continues through the document. According to the default rules, for every element that has no matching template, the XSLT engine should output its text value. This requires processing not only its text nodes, but all the descendants in case they have text values too.
If a template does match an element, it is not required to do anything with the element or its content. In fact, it is often the case that you want certain elements to be ignored. Perhaps they contain metadata that is not to be included with the formatted data. So you are allowed to leave a template empty. Here, the element ignore-me will be passed over by the XSLT processor (unless another rule matches with higher priority):
<xsl:template match="ignore-me"/>
Unless you explicitly tell the XSLT engine how to proceed with processing in the template, it will go no further. Instead, it will revert to the context node set and evaluate the next node in line. If you do want processing to go on to the children, or you want to insert other nodes to process before the next node in the context set, there are some directives at your disposal.
The element apply-templates interrupts the current processing in the template and forces the XSLT engine to move on to the children of the current node. This enables recursive behavior so that processing can descend through the tree of a document. It is called apply-templates because the processor has to find new templates to process the children.
The first template in Example 7-2 contains an apply-templates element:
<xsl:template match="manual">
  <html>
    <head><title>Instructions Guide</title></head>
    <body>
      <h1>Instructions Guide</h1>
      <xsl:apply-templates/>
    </body>
  </html>
</xsl:template>
When processing this template, the XSLT engine would first output the markup starting from the html start tag all the way to the end tag of the h1 element. When it gets to the xsl:apply-templates element, it jumps to the children of the current (manual) element and processes those with their own templates: the elements parts-list and instructions. Attributes such as type and id are not children, so they are not processed unless they are selected explicitly. After all the children have been processed, the XSLT engine returns to its work on the above template and outputs the end tags for body and html.
Suppose that you did not want to handle all the children of a node, but just a few. You can restrict the set of children to process using the attribute select. It takes an XPath location path as its value, giving you a rich assortment of options. For example, we could rewrite the second template in Example 7-2 like so:
<xsl:template match="manual">
  <html>
    <head><title>Parts List</title></head>
    <body>
      <h1>Parts List</h1>
      <xsl:apply-templates select="parts-list"/>
    </body>
  </html>
</xsl:template>
Now only the parts-list element will be processed. All other children of manual, including the instructions element, would be skipped. Alternatively, you can skip a particular element type like this:
<xsl:template match="manual">
  <html>
    <head><title>Assembly Steps</title></head>
    <body>
      <h1>Assembly Steps</h1>
      <xsl:apply-templates select="*[not(self::parts-list)]"/>
    </body>
  </html>
</xsl:template>
The select attribute must yield a node set, so the test goes in a predicate that filters the child elements. Every child element but parts-list will be handled.
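The select attribute is also the way to reach nodes that are not children, such as attributes. The following is a sketch, assuming manual carries an id attribute as in the earlier discussion; the wording of the output paragraph is made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="manual">
    <div>
      <!-- Attributes are processed only when selected by name -->
      <xsl:apply-templates select="@id"/>
      <!-- Then the child nodes, as usual -->
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <!-- Rule for the id attribute of manual -->
  <xsl:template match="manual/@id">
    <p>Document ID: <xsl:value-of select="."/></p>
  </xsl:template>

</xsl:stylesheet>
```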
The for-each element creates a template-within-a-template. Instead of relying on the XSLT engine to find matching templates, this directive encloses its own region of markup. Inside that region, the context node set is redefined to a different node set, again determined by a select attribute. Once outside the for-each, the old context node set is reinstantiated.
Consider this template:
<xsl:template match="book">
  <xsl:for-each select="chapter">
    <xsl:text>Chapter </xsl:text>
    <xsl:value-of select="position()"/>
    <xsl:text>. </xsl:text>
    <xsl:value-of select="title"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
  <xsl:apply-templates/>
</xsl:template>
It creates a table of contents from a DocBook document. The for-each element goes through the book and retrieves every child element of type chapter. This set becomes the new context node set, and within the for-each we know nothing about the old context nodes.
The first value-of element outputs the string value of the XPath expression position(), which is the position in the set of the chapter being evaluated in this iteration through the loop. The next value-of outputs the title of this chapter. Note that it is a child of chapter, not book.
Since the output of this is plain text, I had to insert the last text element to output a newline character. (We will cover formatting and whitespace issues later in the chapter.) The result of this transformation would be something like this:
Chapter 1. Teething on Transistors: My Early Years
Chapter 2. Running With the Geek Gang
Chapter 3. My First White Collar Crime
Chapter 4. Hacking the Pentagon
You may wonder what happens when the for-each directive fails to match any nodes. The answer is, nothing. The XSLT processor never enters the region of the element and instead just continues on with the template. There is no "or else" contingency in for-each, but you can get that functionality by using if and choose constructs covered later in the chapter.
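As a preview, here is one way the missing "or else" branch might be sketched with choose. The fallback message is hypothetical. The test expression chapter is true only when the book has at least one chapter child.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="book">
    <xsl:choose>
      <!-- A non-empty node set counts as true -->
      <xsl:when test="chapter">
        <xsl:for-each select="chapter">
          <xsl:text>Chapter </xsl:text>
          <xsl:value-of select="position()"/>
          <xsl:text>. </xsl:text>
          <xsl:value-of select="title"/>
          <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
      </xsl:when>
      <!-- Reached only when there are no chapters at all -->
      <xsl:otherwise>
        <xsl:text>(This book has no chapters.)&#10;</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```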
All the template rules we have seen so far are specified by their match patterns. They are accessible only by the XSLT engine's template-matching facility. Sometimes, however, you may find it more convenient to create a named template to which you can direct processing manually.
The concept is similar to defining functions in programming. You set aside a block of code and give it a name. Later, you can reference that function and pass it data through arguments. This makes your code simpler and easier to read overall, and functions keep frequently accessed code in one place for easier maintenance. These same benefits are available in your XSLT stylesheet through named templates.
A named template is like any other template except that it has a name attribute. You can use this with a match attribute or in place of one. Its value is a name (a qualified name, to be specific) that uniquely identifies the template.
To direct processing to this template, use the directive call-template, identifying it with a name attribute. For example:
<xsl:template match="document">
  <!-- regular page markup here -->
  <xsl:call-template name="copyright-info"/>
  <!-- generate a page number -->
</xsl:template>

<xsl:template name="copyright-info">
  <p>
    This is some text the lawyers make us write. It appears
    at the bottom of every single document, ad nauseam. Blah
    blah, all rights reserved, blah blah blah, under penalty
    of eating yogurt, blah blah...
  </p>
</xsl:template>
The first template calls the second, named template. Processing jumps over to the named template, then returns to where it left off in the first template. The context node set does not change in this jump. So even in the named template, you could test the current node with self::node() and it would be exactly the same.
Here is another example. This named template generates a menu of navigation links for an HTML page:
<xsl:template name="navbar">
  <div class="navbar">
    <xsl:text>Current document: </xsl:text>
    <xsl:value-of select="title"/>
    <br/>
    <a href="index.htm">Home</a> |
    <a href="help.htm">Help</a> |
    <a href="toc.htm">Contents</a>
  </div>
</xsl:template>
Before the links, I placed two lines to print the current document's title, demonstrating that the current node is the same as it was in the rule that invoked the named template. Since you can call a named template as many times as you want, let us put the navigation menu at the top and bottom of the page:
<xsl:template match="page">
  <body>
    <xsl:call-template name="navbar"/>
    <xsl:apply-templates/>
    <xsl:call-template name="navbar"/>
  </body>
</xsl:template>
If you want to change the context node set for a named template, you must enclose the call in a for-each element:
<xsl:template match="cross-reference">
  <xsl:variable name="reference" select="@ref"/>
  <xsl:for-each select="//*[@id=$reference]">
    <xsl:call-template name="generate-ref-text"/>
  </xsl:for-each>
</xsl:template>
What this template does is handle the occurrence of a cross-reference, which is a link to another element in the same document. For example, an entry in a dictionary might have a "see also" link to another entry. For an element of type cross-reference, this template finds the value of its ref attribute and assigns it to a variable. (As we will see later on when I talk more about variables, this is a useful way of inserting a piece of text into an XPath expression.) The for-each element then locates the element whose ID matches the reference value and sets that to be the context node before passing control over to the template named generate-ref-text. That template will generate some text appropriate for the kind of cross-reference we want.
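The text does not show generate-ref-text itself, so the version below is purely a hypothetical sketch of what such a template might do. It assumes the target element has an id attribute and, optionally, a title child. Because the call is made inside for-each, the current node here is the referenced element, not the cross-reference.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="cross-reference">
    <xsl:variable name="reference" select="@ref"/>
    <xsl:for-each select="//*[@id=$reference]">
      <xsl:call-template name="generate-ref-text"/>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical implementation: link to the target and
       label it with its title, or its element name if it
       has no title -->
  <xsl:template name="generate-ref-text">
    <a href="#{@id}">
      <xsl:choose>
        <xsl:when test="title">
          <xsl:value-of select="title"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="name()"/>
        </xsl:otherwise>
      </xsl:choose>
    </a>
  </xsl:template>

</xsl:stylesheet>
```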
Like subroutines from programming languages, named templates can accept parameters from the templates that call them. This is a way to pass extra information to the template that it needs for processing.
For example, you may have a template that creates a highlighted node or sidebar in a formatted document. You can use a parameter to add some text to the title to set the tone: tip, caution, warning, information, and so on. Here is how that might look:
<xsl:template match="warning">
  <xsl:call-template name="generic-note">
    <xsl:with-param name="label">Look out! </xsl:with-param>
  </xsl:call-template>
</xsl:template>

<xsl:template match="tip">
  <xsl:call-template name="generic-note">
    <xsl:with-param name="label">Useful Tip: </xsl:with-param>
  </xsl:call-template>
</xsl:template>

<xsl:template match="note">
  <xsl:call-template name="generic-note"/>
</xsl:template>

<xsl:template name="generic-note">
  <xsl:param name="label">Note: </xsl:param>
  <blockquote class="note">
    <h3>
      <xsl:value-of select="$label"/>
      <xsl:value-of select="title"/>
    </h3>
    <xsl:apply-templates/>
  </blockquote>
</xsl:template>
This example creates a named template called generic-note that takes one parameter, named label. Each template calling generic-note may define this parameter with the with-param element as a child of the call-template element. Or it may defer to the default value defined in the param element inside the named template, as is the case with the template matching note.
param declares the parameter in the named template. The name attribute gives it a label that you can refer to later in an attribute with a dollar sign preceding, as in the value-of element above. You may use as many parameters as you wish, but each one has to be declared.
If you use the parameter reference inside a non-XPath attribute, you need to enclose it in curly braces ({ }) to force the XSLT engine to resolve it to its text value:
<a href="{$file}">Next Page</a>
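To put that fragment in context, here is a sketch of a named template that builds the link from a parameter. The template name next-link, the default value, and the next attribute on page are all assumptions for illustration. Inside select the parameter would be written as plain $file; the curly braces are needed only in the href attribute of the literal a element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="next-link">
    <!-- Default target if the caller passes nothing -->
    <xsl:param name="file">index.htm</xsl:param>
    <!-- Curly braces make the processor evaluate $file -->
    <a href="{$file}">Next Page</a>
  </xsl:template>

  <xsl:template match="page">
    <xsl:apply-templates/>
    <xsl:call-template name="next-link">
      <!-- select takes an XPath expression, so no braces -->
      <xsl:with-param name="file" select="concat(@next, '.htm')"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```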
Optionally, param may assign a default value using its content. The value is a result tree fragment constructed by evaluating the content of the param element. For example, you can set it with:
<xsl:param name="label">
  <span class="highlight">Note: </span>
</xsl:param>
and the parameter will be set to a result tree fragment containing a span element.
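How the parameter is written out matters once it holds markup. In the sketch below, which reuses the generic-note name from earlier with a simplified body, copy-of reproduces the span element in the result. Using value-of instead would output only the string value, "Note: ", and the span would be lost.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template name="generic-note">
    <xsl:param name="label">
      <span class="highlight">Note: </span>
    </xsl:param>
    <blockquote class="note">
      <h3>
        <!-- copy-of keeps the span markup in the fragment -->
        <xsl:copy-of select="$label"/>
        <xsl:value-of select="title"/>
      </h3>
      <xsl:apply-templates/>
    </blockquote>
  </xsl:template>

  <xsl:template match="note">
    <xsl:call-template name="generic-note"/>
  </xsl:template>

</xsl:stylesheet>
```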