Hack 57 Grouping in XSLT 1.0 and 2.0

If your nodes are out of sorts in your source, use grouping to bring them into line.

This hack shows you two techniques for grouping nodes in the output of an XSLT processor. The first uses XSLT 1.0 and the Muenchian method, named after Steve Muench (see http://www.oreillynet.com/pub/au/609). The second uses the for-each-group element introduced in XSLT 2.0, which is simpler than the XSLT 1.0 method.

3.28.1 Grouping with XSLT 1.0

The problem that grouping solves is that nodes may not be grouped to your liking in the source document. For example, look at group.xml (Example 3-54).

Example 3-54. group.xml
<?xml version="1.0" encoding="US-ASCII"?>
<?xml-stylesheet href="group.xsl" type="text/xsl"?>

<uscities>
 <western>
  <uscity state="Nevada">Las Vegas</uscity>
  <uscity state="Arizona">Phoenix</uscity>
  <uscity state="California">San Francisco</uscity>
  <uscity state="Nevada">Silver City</uscity>
  <uscity state="Washington">Seattle</uscity>
  <uscity state="Montana">Missoula</uscity>
  <uscity state="Washington">Spokane</uscity>
  <uscity state="California">Los Angeles</uscity>
  <uscity state="Utah">Salt Lake City</uscity>
  <uscity state="California">Sacramento</uscity>
  <uscity state="Idaho">Boise</uscity>
  <uscity state="Montana">Butte</uscity>
  <uscity state="Washington">Tacoma</uscity>
  <uscity state="Montana">Helena</uscity>
  <uscity state="Oregon">Portland</uscity>
  <uscity state="Nevada">Reno</uscity>
  <uscity state="Oregon">Salem</uscity>
  <uscity state="Oregon">Eugene</uscity>
  <uscity state="Utah">Provo</uscity>
  <uscity state="Idaho">Twin Falls</uscity>
  <uscity state="Utah">Ogden</uscity>
  <uscity state="Arizona">Flagstaff</uscity>
  <uscity state="Idaho">Idaho Falls</uscity>
  <uscity state="Arizona">Tucson</uscity>
 </western>
</uscities>

The uscity nodes in group.xml list western United States cities in no particular order, not in an organized way as you might prefer. One feature that can help is that each uscity node has a state attribute. The XSLT grouping technique I'll show you organizes the output by state, listing each city under the state it belongs to. This grouping technique is popularly known as the Muenchian method.

The Muenchian method of grouping employs keys together with the generate-id() function. There are other grouping methods in XSLT, such as one that uses the preceding-sibling axis, but I've chosen to show you only the Muenchian method here for two reasons. First, it is generally the fastest method of grouping in XSLT 1.0, because a processor can look up nodes through a key rather than rescanning siblings for every node; second, it is the most similar to the new grouping method using the XSLT 2.0 element for-each-group, which you will see in the next section.
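
For contrast, here is a rough sketch of what the preceding-sibling approach might look like when applied to group.xml. It is not part of this hack's files; it writes plain text rather than HTML to keep things short, and it assumes the same /uscities/western/uscity structure that group.xsl addresses.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="/">
 <!-- Keep only the uscity elements that have no earlier sibling with the same state -->
 <xsl:for-each select="/uscities/western/uscity[not(@state = preceding-sibling::uscity/@state)]">
  <xsl:sort select="@state"/>
  <xsl:value-of select="@state"/>
  <xsl:text>: </xsl:text>
  <!-- Rescan all the siblings to collect the cities in this state -->
  <xsl:for-each select="../uscity[@state = current()/@state]">
   <xsl:sort/>
   <xsl:value-of select="."/>
   <xsl:if test="position() != last()">, </xsl:if>
  </xsl:for-each>
  <xsl:text>&#10;</xsl:text>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Both the predicate and the inner for-each walk the sibling list for every node they test, which is why this approach tends to slow down as documents grow.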

The stylesheet group.xsl is shown in Example 3-55. It produces HTML output and assembles its output according to the Muenchian method.

Example 3-55. group.xsl
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:key name="list" match="uscity" use="@state"/>

<xsl:template match="/">
<html>
<head>
<title>Western State Cities</title></head>
<style type="text/css">
h2 {font-family:verdana,helvetica,sans-serif;font-size:13pt}
li {font-family:verdana,helvetica,sans-serif;font-size:11pt}
</style>
<body>
<xsl:for-each select="/uscities/western/uscity[generate-id(.)=generate-id(key('list', @state))]/@state">
<xsl:sort/>
<h2><xsl:value-of select="."/></h2>
<ul>
 <xsl:for-each select="key('list', .)">
 <xsl:sort/>
  <li><xsl:value-of select="."/></li>
 </xsl:for-each>
</ul>
</xsl:for-each>
</body>
</html>
</xsl:template>

</xsl:stylesheet>

The secret to understanding the Muenchian method lies in its use of keys and the generate-id() function. On line 3 of group.xsl, the key named list is defined. This key indexes the uscity elements in group.xml by the value of their state attributes, so that key('list', 'Nevada'), for example, efficiently returns every uscity element whose state is Nevada.
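
To see a key on its own, you might try a throwaway stylesheet like the following sketch. It declares the same key as group.xsl and looks up a single, hardcoded state; the choice of Nevada and the text output method are mine, just for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="list" match="uscity" use="@state"/>

<xsl:template match="/">
 <!-- key() returns every uscity whose state attribute equals the second argument -->
 <xsl:for-each select="key('list', 'Nevada')">
  <xsl:value-of select="."/>
  <xsl:text>&#10;</xsl:text>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Run against group.xml, this should list only the Nevada cities, in document order because there is no sort element.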

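Without getting too overwrought, the generate-id() function is used with the key() function in for-each to process only the first node in each group (line 14). The predicate compares the ID generated for the current uscity node with the ID generated for the first node that key() returns for the same state; the two match only when the current node is the first one for that state. For each such node, the for-each selects the state attribute, and the value-of on line 16 outputs the name of the state.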
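
If the generate-id() comparison still seems opaque, a diagnostic sketch along these lines may help. Instead of filtering, it visits every uscity in group.xml and reports whether the node passes the first-in-group test. The report wording and the text output method are my own; only the key and the test come from group.xsl.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:key name="list" match="uscity" use="@state"/>

<xsl:template match="/">
 <xsl:for-each select="/uscities/western/uscity">
  <xsl:value-of select="."/>
  <xsl:text> (</xsl:text>
  <xsl:value-of select="@state"/>
  <xsl:text>): </xsl:text>
  <xsl:choose>
   <!-- True only if this node is the first one the key returns for its state.
        An equivalent test is count(. | key('list', @state)[1]) = 1 -->
   <xsl:when test="generate-id(.) = generate-id(key('list', @state)[1])">first in group</xsl:when>
   <xsl:otherwise>not first</xsl:otherwise>
  </xsl:choose>
  <xsl:text>&#10;</xsl:text>
 </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The explicit [1] is optional, because generate-id() applied to a node-set uses the node that comes first in document order. I've spelled it out here only to make the intent visible.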

Following that, another for-each (line 18) uses key() again, this time to process every uscity node whose state attribute matches the state selected by the outer for-each. The value-of (line 20) under this for-each outputs the name of each city. The sort elements (lines 15 and 19) under the for-each elements sort the states and the cities in alphabetical order.

It's a little complicated, but it works well. Test it with Xalan C++ [Hack #32]:

xalan -m -i 1 -o group.html group.xml group.xsl

You will get nicely grouped HTML output, stored in group.html (Example 3-56).

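Example 3-56. group.html
<html>
 <head>
  <title>Western State Cities</title>
 </head>
 <style type="text/css">
h2 {font-family:verdana,helvetica,sans-serif;font-size:13pt}
li {font-family:verdana,helvetica,sans-serif;font-size:11pt}
</style>
 <body>
  <h2>Arizona</h2>
  <ul>
   <li>Flagstaff</li>
   <li>Phoenix</li>
   <li>Tucson</li>
  </ul>
  <h2>California</h2>
  <ul>
   <li>Los Angeles</li>
   <li>Sacramento</li>
   <li>San Francisco</li>
  </ul>
  <h2>Idaho</h2>
  <ul>
   <li>Boise</li>
   <li>Idaho Falls</li>
   <li>Twin Falls</li>
  </ul>
  <h2>Montana</h2>
  <ul>
   <li>Butte</li>
   <li>Helena</li>
   <li>Missoula</li>
  </ul>
  <h2>Nevada</h2>
  <ul>
   <li>Las Vegas</li>
   <li>Reno</li>
   <li>Silver City</li>
  </ul>
  <h2>Oregon</h2>
  <ul>
   <li>Eugene</li>
   <li>Portland</li>
   <li>Salem</li>
  </ul>
  <h2>Utah</h2>
  <ul>
   <li>Ogden</li>
   <li>Provo</li>
   <li>Salt Lake City</li>
  </ul>
  <h2>Washington</h2>
  <ul>
   <li>Seattle</li>
   <li>Spokane</li>
   <li>Tacoma</li>
  </ul>
 </body>
</html>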

In the output, under each alphabetically listed state comes an alphabetical list of cities. That's what grouping can do for you. Figure 3-29 shows group.html in the Opera browser. Because they support client-side XSLT, you can also open group.xml in Mozilla, Firefox, Netscape, or IE, which will render the document according to group.xsl. Opera does not support client-side XSLT.

Figure 3-29. group.html in Opera

3.28.2 Grouping with XSLT 2.0

The design behind grouping in XSLT 2.0 probably grew out of experience with grouping in Version 1.0. Grouping in XSLT 1.0 usually brings the for-each instruction element into service. XSLT 2.0 has a new instruction element called for-each-group that makes grouping a relative snap.

Glance at group2.xml, which lumps XPath 2.0's context-related functions into two piles by labeling them with a type attribute (Example 3-57).

Example 3-57. group2.xml
<?xml version="1.0"?>

<list>
 <description>XPath 2.0 Context Functions</description>

 <function type="new">current-date( )</function>
 <function type="new">current-dateTime( )</function>
 <function type="new">current-time( )</function>
 <function type="new">default-collation( )</function>
 <function type="new">implicit-timezone( )</function>
 <function type="legacy">last( )</function>
 <function type="legacy">position( )</function>
</list>

The seven functions in this list are either legacy functions or new ones. The group2.xsl stylesheet (Example 3-58) groups the functions in group2.xml according to the content of the type attribute.

Example 3-58. group2.xsl
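<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="list">
<list><!-- wrapper element name assumed; it was lost from this listing -->
 <xsl:for-each-group select="function" group-by="@type">
  <functions type="{@type}">
   <xsl:value-of select="current-group( )" separator=", "/>
  </functions>
 </xsl:for-each-group>
</list>
</xsl:template>

</xsl:stylesheet>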

The for-each-group element (line 6) uses its select attribute to select the sequence (node-set in XSLT 1.0) to group; i.e., all function children of list. The group-by attribute determines the key for grouping, which in this case is the content of the type attribute in the source. The functions literal result element uses an attribute value template to reflect the value of the type attribute.
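
As an aside, XSLT 2.0 also provides the current-grouping-key() function, which returns the value that defines the group being processed. The following sketch is a variation on group2.xsl that uses it in place of {@type}; the groups wrapper element is a name I made up for illustration.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="list">
 <!-- "groups" is a hypothetical wrapper element name -->
 <groups>
  <xsl:for-each-group select="function" group-by="@type">
   <!-- current-grouping-key() returns the value shared by every item in this group -->
   <functions type="{current-grouping-key()}">
    <xsl:value-of select="current-group()" separator=", "/>
   </functions>
  </xsl:for-each-group>
 </groups>
</xsl:template>

</xsl:stylesheet>
```

With a simple attribute as the grouping key, the two forms should produce the same result. current-grouping-key() becomes more useful when group-by is a computed expression rather than a single attribute.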

On line 8, the value-of element's select attribute uses the current-group( ) function (also a new kid on the block in XSLT 2.0) to retrieve all the items in the group currently being processed. The separator attribute is a new addition to XSLT 2.0, too. It tells the XSLT 2.0 processor to write a comma followed by a space between adjacent items as they are sent to the result tree.
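
Because current-group() returns an ordinary sequence, you can also count it or iterate over it. Here is a minimal sketch, assuming the group2.xml structure shown earlier; the groups, group, and name elements and the size attribute are names invented for this illustration.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="list">
 <!-- groups, group, name, and size are hypothetical names -->
 <groups>
  <xsl:for-each-group select="function" group-by="@type">
   <!-- count() reports how many items landed in this group -->
   <group type="{@type}" size="{count(current-group())}">
    <!-- Visit each member of the group individually -->
    <xsl:for-each select="current-group()">
     <name><xsl:value-of select="."/></name>
    </xsl:for-each>
   </group>
  </xsl:for-each-group>
 </groups>
</xsl:template>

</xsl:stylesheet>
```

This writes one name element per function instead of a comma-separated string, which may be handier if another process has to consume the result.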

In XSLT 1.0, value-of outputs only the first node of a returned node-set in string form; in XSLT 2.0, all items in a sequence are output (separated by a single space when select is used without a separator attribute), so you have to plan accordingly.
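
A quick way to see the difference for yourself is a sketch like this one, which applies value-of to all the function elements in group2.xml, first without and then with a positional predicate. The text output method is my choice to keep the result easy to read.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="list">
 <!-- No separator attribute: XSLT 2.0 joins all the items with single spaces -->
 <xsl:value-of select="function"/>
 <xsl:text>&#10;</xsl:text>
 <!-- An explicit [1] reproduces the XSLT 1.0 behavior of using only the first node -->
 <xsl:value-of select="function[1]"/>
 <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

If you change the version attribute to 1.0 and run the stylesheet through an XSLT 2.0 processor, backward-compatible behavior should apply, and the first value-of should output only the first function as well.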

You might guess that for-each-group has several other attributes, which it does: group-adjacent, group-starting-with, group-ending-with, and collation. I'm not going to cover them here, but you can read more about for-each-group and its attributes in Section 14 of the XSLT 2.0 specification.

To get this example to work, you need the latest version of Saxon (currently 8.0, which supports XSLT 2.0 [http://saxon.sourceforge.net]). Use this command to transform group2.xml:

java -jar saxon8.jar group2.xml group2.xsl

The result is two lists of functions, grouped and comma-separated, in functions elements (Example 3-59).

Example 3-59. Output of group2.xsl
<?xml version="1.0" encoding="UTF-8"?>
<list>
   <functions type="new">current-date( ), current-dateTime( ), current-time( ), default-collation( ), implicit-timezone( )</functions>
   <functions type="legacy">last( ), position( )</functions>
</list>

This example should give you a feel for how to group nodes in XSLT 2.0.

3.28.3 See Also

  • Learning XSLT, by Michael Fitzgerald (O'Reilly), pages 206-209 and 286-288