Hack 34 Analyze Nodes with TreeViewer

View nodes in an XML document according to the XPath 1.0 data model.

The XPath 1.0 data model (http://www.w3.org/TR/xpath#data-model) treats an XML document as a tree built from seven possible node types:

  • Root nodes (called document nodes in the XPath 2.0 data model; see http://www.w3.org/TR/xpath-datamodel/#DocumentNode)

  • Element nodes

  • Attribute nodes

  • Text nodes

  • Comment nodes

  • Processing instruction nodes

  • Namespace nodes
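
As a rough illustration of the list above, the following sketch tallies the nodes of a small document by XPath 1.0 node type. It uses Python's standard xml.dom.minidom module, which implements the DOM rather than the XPath data model, so the mapping table is an approximation. The sample document, the XPATH_NAMES table, and the count_node_types function are all hypothetical names introduced here. Namespace nodes are omitted because the DOM has no such node; it stores namespace declarations as attributes.

```python
from collections import Counter
from xml.dom import Node, minidom

# Hypothetical sample, loosely modeled on time.xml (not the book's file).
SAMPLE = (
    "<!-- a time instant -->\n"
    '<time timezone="PST">\n'
    " <hour>11</hour>\n"
    ' <atomic signal="true"/>\n'
    "</time>"
)

# DOM node-type constants mapped to XPath 1.0 node-type names.
XPATH_NAMES = {
    Node.DOCUMENT_NODE: "root",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.CDATA_SECTION_NODE: "text",  # XPath has no separate CDATA node
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}


def count_node_types(node, counts=None):
    """Tally the nodes at and below `node` by XPath 1.0 node type."""
    if counts is None:
        counts = Counter()
    name = XPATH_NAMES.get(node.nodeType)
    if name:
        counts[name] += 1
    if node.nodeType == Node.ELEMENT_NODE:
        for attr in node.attributes.values():
            # The DOM keeps xmlns declarations as attributes; XPath does not.
            if attr.name == "xmlns" or attr.name.startswith("xmlns:"):
                continue
            counts["attribute"] += 1
    for child in node.childNodes:
        count_node_types(child, counts)
    return counts


counts = count_node_types(minidom.parseString(SAMPLE))
for name, total in sorted(counts.items()):
    print(name, total)
```

The text count includes the whitespace-only text nodes between child elements, which are easy to overlook when reading the source.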

Mike Brown and Jeni Tennison have created several stylesheets, available at http://skew.org/xml/stylesheets/treeview/, that visually represent all seven of the XPath node types. Such tools are useful for uncovering less obvious nodes (namespace nodes or whitespace-only text nodes) and for learning the XPath model. These stylesheets let you view an XML tree either as ASCII text (ascii-treeview.xsl) or as HTML (tree-view.xsl with tree-view.css). All three files are available in the working directory where you extracted the file archive for the book.

To apply ascii-treeview.xsl to time.xml with an XSLT processor such as Xalan, use this command:

xalan time.xml ascii-treeview.xsl

The command produces the text tree view of time.xml shown in Example 3-5.

Example 3-5. Output from ascii-treeview.xsl
root

  |_ _ _comment ' a time instant '

  |_ _ _element 'time'

        |  \_ _ _attribute 'timezone' = 'PST'

        |_ _ _text '\n '

        |_ _ _element 'hour'

        |     |_ _ _text '11'

        |_ _ _text '\n '

        |_ _ _element 'minute'

        |     |_ _ _text '59'

        |_ _ _text '\n '

        |_ _ _element 'second'

        |     |_ _ _text '59'

        |_ _ _text '\n '

        |_ _ _element 'meridiem'

        |     |_ _ _text 'p.m.'

        |_ _ _text '\n '

        |_ _ _element 'atomic'

        |        \_ _ _attribute 'signal' = 'true'

        |_ _ _text '\n'

In the result, each node in time.xml carries a label: root, comment, element, attribute, or text. You can even see where the whitespace-only text nodes are: they appear as \n, followed by any indentation.
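
If no XSLT processor is at hand, whitespace-only text nodes can also be located with a few lines of code. The sketch below is a minimal illustration using Python's standard xml.dom.minidom module. The sample string and the whitespace_text_nodes function are hypothetical and are not part of the TreeViewer stylesheets.

```python
from xml.dom import Node, minidom

# Hypothetical sample with line breaks and indentation between elements.
SAMPLE = "<time>\n <hour>11</hour>\n <minute>59</minute>\n</time>"


def whitespace_text_nodes(node):
    """Yield (parent name, text) for every whitespace-only text node."""
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE and child.data.strip() == "":
            yield node.nodeName, child.data
        else:
            # Text and comment nodes have no children, so this ends quickly.
            yield from whitespace_text_nodes(child)


found = list(whitespace_text_nodes(minidom.parseString(SAMPLE)))
for parent, text in found:
    # repr() makes the otherwise invisible characters visible, e.g. '\n '.
    print(parent, repr(text))
```

Printing each value with repr() serves the same purpose as the quoted '\n ' strings in the stylesheet output: it makes otherwise invisible characters visible.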

By default, namespace nodes are not shown. You can show namespace nodes with the show_ns parameter. Parameters are values that you can pass into stylesheets or templates at run time. These values can change the outcome of a transformation.
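
The effect of such a switch can be imitated outside XSLT. The sketch below, again using Python's standard xml.dom.minidom module, lists elements and, only when asked, the namespace bindings in scope for each one. It is an analogy under stated assumptions and not how ascii-treeview.xsl is implemented: the show_ns keyword argument merely mirrors the stylesheet parameter's name, and the sample document and function names are hypothetical.

```python
from xml.dom import Node, minidom

# Hypothetical stand-in for a namespaced document (not the book's file).
SAMPLE = (
    '<tz:time xmlns:tz="http://www.wyeast.net/time">'
    "<tz:hour>11</tz:hour>"
    "</tz:time>"
)

XML_NS = "http://www.w3.org/XML/1998/namespace"


def in_scope_namespaces(element):
    """Return prefix -> URI bindings visible at `element`."""
    chain = []
    node = element
    while node is not None and node.nodeType == Node.ELEMENT_NODE:
        chain.append(node)
        node = node.parentNode
    bindings = {"xml": XML_NS}  # the xml prefix is always implicitly in scope
    for ancestor in reversed(chain):  # outermost first, so inner ones override
        for attr in ancestor.attributes.values():
            if attr.name == "xmlns" or attr.name.startswith("xmlns:"):
                prefix = attr.name.partition(":")[2]  # "" = default namespace
                if attr.value:
                    bindings[prefix] = attr.value
                else:
                    bindings.pop(prefix, None)  # xmlns="" undeclares it
    return bindings


def element_lines(element, show_ns=False, depth=0):
    """List elements as indented lines; add namespace lines if show_ns."""
    lines = ["  " * depth + "element " + element.tagName]
    if show_ns:
        for prefix, uri in sorted(in_scope_namespaces(element).items()):
            label = prefix or "(default)"
            lines.append("  " * (depth + 1) + f"namespace {label} = {uri}")
    for child in element.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            lines.extend(element_lines(child, show_ns, depth + 1))
    return lines


root = minidom.parseString(SAMPLE).documentElement
print("\n".join(element_lines(root)))                # default: no namespaces
print("\n".join(element_lines(root, show_ns=True)))  # parameter switched on
```

The second call shows why namespace nodes are easy to miss. In the XPath 1.0 model every element has its own namespace node for each binding in scope, including the implicit xml prefix, even though the declaration appears only once in the source.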

Now we'll expose a tree view of namespace.xml. To see the namespace nodes, pass the show_ns parameter into ascii-treeview.xsl. With Saxon, you supply a parameter as a name=value pair after the stylesheet name, as shown here:

saxon namespace.xml ascii-treeview.xsl show_ns=yes

Figure 3-10 shows the result; notice the ns and namespace labels.

Figure 3-10. Output from ascii-treeview.xsl showing namespace nodes

The document tree.xml contains a processing instruction and has only namespace-qualified, prefixed elements (Example 3-6).

Example 3-6. tree.xml
<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet href="tree-view.xsl" type="text/xsl"?>

   

<!-- a time instant -->

<tz:time timezone="PST" xmlns:tz="http://www.wyeast.net/time">

 <tz:hour>11</tz:hour>

 <tz:minute>59</tz:minute>

 <tz:second>59</tz:second>

 <tz:meridiem>p.m.</tz:meridiem>

 <tz:atomic signal="true"/>

</tz:time>
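
The xml-stylesheet line in Example 3-6 is a processing instruction node with a target and a data string. The XML declaration on the first line is not a node in the XPath data model, even though it looks similar. The following sketch shows the difference using Python's standard xml.dom.minidom module. The abbreviated DOC string and the processing_instructions function are hypothetical, and the function looks only at the top level of the document.

```python
from xml.dom import Node, minidom

# Abbreviated, hypothetical version of a document like tree.xml.
DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<?xml-stylesheet href="tree-view.xsl" type="text/xsl"?>\n'
    '<tz:time xmlns:tz="http://www.wyeast.net/time"/>'
)


def processing_instructions(document):
    """Return (target, data) for each top-level processing instruction."""
    return [
        (node.target, node.data)
        for node in document.childNodes
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    ]


pis = processing_instructions(minidom.parseString(DOC))
for target, data in pis:
    print(target, "->", data)
```

Only the xml-stylesheet instruction is reported. Its href and type look like attributes, but they are part of a single data string, which is why a tree view shows them as one value.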

The xml-stylesheet PI near the top of the document refers to the tree-view.xsl stylesheet, which produces HTML styled with CSS (tree-view.css). To apply tree-view.xsl to tree.xml, open tree.xml in a browser that supports client-side XSLT transformations, such as IE, Mozilla, Firefox, or Netscape. Figure 3-11 shows a portion of the tree view of tree.xml in the Netscape browser. Each kind of node label has its own background color, and namespace names are enclosed in braces. The names of elements and attributes appear on a white background.

Figure 3-11. tree.xml transformed by tree-view.xsl and styled by tree-view.css