Handling XML

As mentioned in the previous chapter, the Compact Framework includes a subset of the XML support found in the desktop Framework's System.Xml namespace. Specifically, the Compact Framework ships with the System.Xml.Schema and System.Xml.Serialization namespaces (the most significant omission being the XmlSerializer class) but not the System.Xml.XPath and System.Xml.Xsl namespaces. In addition, although the System.Xml.Schema namespace is included with its XmlSchemaObject, XmlSchema, and XmlSchemaException classes, these classes are not functional and cannot be used to load, manipulate, and save XML Schema Definition Language (XSD) documents. Even with these omissions, however, developers will find a wealth of functionality for reading and writing XML documents using both the DOM and stream-based readers and writers.

NOTE

Although Extensible Stylesheet Language Transformations (XSLT) are not supported, keep in mind that XSLT is particularly useful in Web programming environments, where server-side code is called upon to transform an XML document using an XSL stylesheet. Because the Compact Framework does not support ASP.NET, there is little need for XSLT.


Using the DOM


The System.Xml namespace includes the familiar DOM programming model through its XmlDocument class, which implements the W3C DOM Level 1 Core and DOM Level 2 Core specifications using an in-memory tree representation of the document. Many developers are already familiar with the DOM from working with the COM-based Microsoft XML Parser (MSXML), so using XmlDocument to manipulate local XML will often involve the smallest learning curve. In fact, the XmlDocument class is analogous to the DOMDocument class found in MSXML.

To load an XML document with the XmlDocument class, a developer can use either the LoadXml or the Load method. The former accepts a string that includes the well-formed XML to load, while the latter is overloaded to accept a filename, an XmlReader, a TextReader, or a Stream object. If the XML from the source is not well formed, an XmlException will be thrown. For example, consider an XML document that represents a score sheet for a baseball game, part of which (only the first batter and first at bat for each team, due to space limitations) is shown in Listing 3-6.

Listing 3-6 A Sample XML Document. This XML document represents a score sheet from a baseball game. Note that the Player and PA elements would repeat in a typical document.
<?xml version="1.0" encoding="utf-8"?>
<Scoresheet Visitor="Chicago Cubs" Home="San Francisco Giants"
  Date="08/06/2002" Time="9:15 PM" >
  <Visitor>
    <Lineup>
      <Player Order="1" Name="Mark Bellhorn" Position="2B"
        Inning="1" />
    </Lineup>
    <PA Inning="1" Order="1" Pitches="BBS" OutNumber="1" Out="F7" />
  </Visitor>
  <Home>
    <Lineup>
      <Player Order="1" Name="Lofton" Position="CF" Inning="1" />
    </Lineup>
    <PA Inning="1" Order="1" Pitches="BBS" LastBase="1" Result="1B" />
  </Home>
</Scoresheet>

This document can be loaded by the Scoresheet class discussed previously and parsed using the code in the LoadXmlDoc method in Listing 3-7. Note that the familiar DocumentElement and GetElementsByTagName members are present, as in other implementations of the DOM, such as MSXML. Developers then work with the individual nodes in the document using the XmlNode class, which serves as the base class for XmlDocument and other classes, such as XmlAttribute, which represents attributes.

Listing 3-7 Manipulating a Document with the DOM. This method loads an XML document using the DOM. The _addPlayers method parses an entire XmlNodeList for the home and visiting teams.
Public Sub LoadXmlDoc(ByVal fileName As String)
    Dim d As New XmlDocument

    Try
        ' Load the xml file
        d.Load(fileName)

        Me.Visitor = d.DocumentElement.Attributes("Visitor").Value
        Me.Home = d.DocumentElement.Attributes("Home").Value
        Me.GameDate = d.DocumentElement.Attributes("Date").Value
        Me.GameTime = d.DocumentElement.Attributes("Time").Value

        Dim xnl As XmlNodeList

        ' Parse the visiting team
        xnl = d.GetElementsByTagName("Visitor")
        _addPlayers(xnl, Me.VisitingPlayers, Me.VisitingLine)
        ' Parse the home team
        xnl = d.GetElementsByTagName("Home")
        _addPlayers(xnl, Me.HomePlayers, Me.HomeLine)

    Catch e As XmlException
        Throw New ApplicationException("Could not load " & fileName, e)
    End Try

End Sub
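
The body of the _addPlayers method is not shown in Listing 3-7. As a rough sketch only, a helper of that kind might walk the XmlNodeList returned by GetElementsByTagName and read each Player element's attributes. The CollectPlayerNames function and the console module below are hypothetical stand-ins written for illustration, and the document is loaded from an inline string with LoadXml so that the sketch is self-contained.

```vbnet
Imports System
Imports System.Collections
Imports System.Xml

Module DomWalkSketch

    ' Hypothetical stand-in for _addPlayers: collects the Name attribute
    ' of every Player element inside a Lineup beneath the given nodes.
    Public Function CollectPlayerNames(ByVal teams As XmlNodeList) As ArrayList
        Dim names As New ArrayList
        Dim team As XmlNode
        For Each team In teams
            Dim child As XmlNode
            For Each child In team.ChildNodes
                If child.Name = "Lineup" Then
                    Dim player As XmlNode
                    For Each player In child.ChildNodes
                        If player.Name = "Player" Then
                            names.Add(player.Attributes("Name").Value)
                        End If
                    Next
                End If
            Next
        Next
        Return names
    End Function

    Sub Main()
        ' LoadXml accepts the XML as a string; Load would accept a file name
        Dim d As New XmlDocument
        d.LoadXml("<Scoresheet><Visitor><Lineup>" & _
            "<Player Order=""1"" Name=""Mark Bellhorn"" Position=""2B"" Inning=""1"" />" & _
            "</Lineup></Visitor></Scoresheet>")

        Dim playerName As String
        For Each playerName In CollectPlayerNames(d.GetElementsByTagName("Visitor"))
            Console.WriteLine(playerName)   ' prints: Mark Bellhorn
        Next
    End Sub

End Module
```

Because the whole tree is already in memory, the helper can move freely among child nodes, which is the convenience the DOM offers in exchange for its memory cost.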

To persist an XML document loaded into the DOM, developers can use the Save method of the XmlDocument class. This method is overloaded and allows saving to a file, a Stream, a TextWriter, or an XmlWriter (to be discussed shortly). In this way, developers have the flexibility to use an existing stream or even to store the XML in memory for a short period using a MemoryStream.
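
As a minimal sketch of the in-memory case, the following console module (a hypothetical wrapper, not part of the Scoresheet class) saves a small document to a MemoryStream, rewinds the stream, and loads the XML back into a second XmlDocument.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module SaveSketch

    Sub Main()
        Dim d As New XmlDocument
        d.LoadXml("<Scoresheet Visitor=""Chicago Cubs"" Home=""San Francisco Giants"" />")

        ' Save the document to an in-memory stream rather than a file
        Dim ms As New MemoryStream
        d.Save(ms)

        ' Rewind and reload from the same stream to show the round trip
        ms.Position = 0
        Dim copy As New XmlDocument
        copy.Load(ms)
        Console.WriteLine(copy.DocumentElement.Attributes("Home").Value)

        ms.Close()
    End Sub

End Module
```

The same Save call would work unchanged with a FileStream or any other Stream, which is what makes the overload useful when a stream is already open.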

Using XML Readers and Writers

One of the most interesting innovations supported by the desktop Framework and carried into the Compact Framework is the way developers can interact with XML documents through the use of a stream-based API analogous to the stream reading and writing performed on files. At the core of this API are the XmlReader and XmlWriter classes, which provide read-only, forward-only, cursor-style access to XML documents and a mechanism for writing out XML documents, respectively. Because these classes implement a stream-based approach, they do not require that the XML document be parsed into a tree structure and cached in memory as happens when working with the document through the XmlDocument class.

Using XML Readers

Obviously, the DOM programming model is not ideal for all applications, particularly when the XML document is large. For all but the smallest XML documents, building the DOM tree slows performance, and storing the tree consumes additional memory. On CPU- and memory-constrained devices like those running the Compact Framework, this is especially important to consider.[6]

[6] To address these issues on desktop PCs, Microsoft included Simple API for XML (SAX) in MSXML 3.0 to provide an event-driven programming model for XML documents. Although this alleviated the performance and memory constraints of the DOM, it did so at the cost of complexity.


The XmlReader is designed to alleviate these constraints by combining the best aspects of the DOM and the event-based Simple API for XML (SAX) in MSXML in the context of a stream-based architecture. In this model, developers pull data from the document using an intuitive cursor-style looping construct, rather than being pushed data by events fired from the parser or querying an existing tree structure.

The XmlReader class is actually an abstract base class for the XmlTextReader and XmlNodeReader classes and is often used polymorphically as the input or output argument of other methods in the Compact Framework. Listing 3-8 uses the XmlTextReader to parse the XML document shown in Listing 3-6. Notice that this listing is functionally identical to Listing 3-7.

Listing 3-8 Manipulating a Document with an XmlReader. This method loads the same XML document as in Listing 3-6, but this time using an XmlTextReader.
Public Sub LoadXmlReader(ByVal fileName As String)

    Dim xlr As XmlTextReader

    Try
       xlr = New XmlTextReader(fileName)
       xlr.WhitespaceHandling = WhitespaceHandling.None

       Do While xlr.Read()
           Select Case xlr.Name
               Case "Scoresheet"
                   If xlr.IsStartElement Then
                       Me.Home = xlr.GetAttribute("Home")
                       Me.Visitor = xlr.GetAttribute("Visitor")
                       ' The attributes are named Time and Date in Listing 3-6
                       Me.GameTime = xlr.GetAttribute("Time")
                       Me.GameDate = xlr.GetAttribute("Date")
                   End If
               Case "Visitor"
                   If xlr.IsStartElement Then
                       _addPlayersReader(xlr, Me.VisitingPlayers, _
                        Me.VisitingLine)
                   End If
               Case "Home"
                   If xlr.IsStartElement Then
                       _addPlayersReader(xlr, Me.HomePlayers, _
                        Me.HomeLine)
                   End If
           End Select

      Loop
    Catch e As XmlException
        Throw New ApplicationException("Could not load " & fileName, e)
   Finally
       ' The reader is Nothing if its constructor failed
       If Not xlr Is Nothing Then xlr.Close()
   End Try

End Sub


Listing 3-8 differs from Listing 3-7 in that the document is parsed piecemeal, using a Do loop and a Select Case statement, rather than in a single shot using the Load method.[7] This approach has two consequences. First, because the document is processed incrementally, an XmlException for a document that is not well formed is not thrown until the offending element is reached. Second, even for small XML documents, the XmlReader is faster than the DOM. In fact, even for score sheet documents like these, ranging from 4KB to 8KB in size, the difference is easily measurable, with the XmlReader being more than 25% faster.

[7] In fact, behind the scenes, the XmlDocument class uses an XmlReader to parse and load the document into the in-memory tree.
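
The first consequence can be illustrated with a short sketch. The console module below is hypothetical and reads from an inline string through a StringReader rather than from a file. The closing tag for the Visitor element is deliberately missing, yet the reader reports the elements that precede the error before the XmlException is raised.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module LazyErrorSketch

    Sub Main()
        ' The closing tag for Visitor is deliberately missing
        Dim markup As String = "<Scoresheet><Lineup /><Visitor></Scoresheet>"
        Dim xlr As New XmlTextReader(New StringReader(markup))

        Try
            Do While xlr.Read()
                If xlr.NodeType = XmlNodeType.Element Then
                    ' Scoresheet, Lineup, and Visitor are reported in turn
                    Console.WriteLine("Read element " & xlr.Name)
                End If
            Loop
        Catch e As XmlException
            ' Raised only when the mismatched end tag is reached
            Console.WriteLine("Parse stopped: document is not well formed")
        Finally
            xlr.Close()
        End Try
    End Sub

End Module
```

An application that updates its own state inside the loop, as Listing 3-8 does, may therefore be left with partially loaded data when the exception occurs and should be prepared to discard it.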

Although this point is not made in Chapter 2, the Compact Framework does not support the XmlValidatingReader class, which derives from XmlReader in the desktop Framework, where it can be used to validate an XML document against a document type definition (DTD), XML-Data Reduced (XDR) schema, or XSD document.

Using XML Writers

The Compact Framework also provides streamed access for writing XML documents by including the XmlWriter class. As with XmlReader, XmlWriter is the abstract base class; developers typically work with the derived XmlTextWriter class.

The XmlTextWriter includes properties that control XML formatting and namespace usage; methods analogous to those of the other stream writers discussed previously, such as Flush and Close; and a bevy of Write methods that add text to the output stream. An example of writing an XML document using the XmlTextWriter is shown in Listing 3-9.

Listing 3-9 Writing a Document with an XmlTextWriter. This method writes out an XML document representing the box score stored by the Scoresheet class.
Public Sub WriteXml(ByVal fileName As String)

    Dim tw As XmlTextWriter

    Try
        tw = New XmlTextWriter(fileName, New System.Text.UTF8Encoding)

        tw.Formatting = Formatting.Indented
        tw.Indentation = 4

        ' Write out the header information
        tw.WriteStartDocument()
        tw.WriteComment("Produced on " & Now.ToShortDateString())

        tw.WriteStartElement("BoxScore")
        tw.WriteAttributeString("Visitor", Me.Visitor)
        tw.WriteAttributeString("Home", Me.Home)
        tw.WriteAttributeString("Date", Me.GameDate)
        tw.WriteAttributeString("Time", Me.GameTime)

        ' Visiting team
        tw.WriteStartElement("Visitor")

        Dim p As PlayerLine
        For Each p In Me.VisitingPlayers
            tw.WriteStartElement("Player")
            tw.WriteAttributeString("Order", p.Order)
            tw.WriteAttributeString("Name", p.Name)
            tw.WriteAttributeString("Position", p.Pos)
            tw.WriteAttributeString("Inning", p.Inning)
            tw.WriteElementString("AB", p.AB)
            tw.WriteElementString("H", p.H)
            ' Other properties here
            tw.WriteEndElement() ' Finish player
        Next

        tw.WriteEndElement() ' Finish visitor

        ' Do the same for the home team

        tw.WriteEndDocument() ' Finish off the document

    Catch e As XmlException
        Throw New ApplicationException("Could not write " & fileName, e)
    Finally
        ' The writer is Nothing if its constructor failed
        If Not tw Is Nothing Then tw.Close()
    End Try
End Sub

Notice in Listing 3-9 that write methods such as WriteStartDocument, WriteStartElement, and WriteAttributeString are used to write the XML and that the XmlTextWriter closes all the open elements with a single call to WriteEndDocument. A portion of the resulting XML file follows:


<?xml version="1.0" encoding="utf-8"?>
<!-- Produced on 11/9/2002 -->
<BoxScore Visitor="Chicago Cubs" Home="San Francisco Giants"
  Date="08/06/2002" Time="9:15 PM" >
  <Visitor>
      <Player Order="1" Name="Mark Bellhorn" Position="2B"
        Inning="1">
        <AB>4</AB>
        <H>2</H>
      </Player>
  </Visitor>
</BoxScore>
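
To see the element-closing behavior of WriteEndDocument in isolation, consider the following sketch. It is a hypothetical console module that writes to a StringWriter instead of a file so that the result can be printed directly. Three elements are opened and none is closed explicitly.

```vbnet
Imports System
Imports System.IO
Imports System.Xml

Module WriterSketch

    Sub Main()
        Dim sw As New StringWriter
        Dim tw As New XmlTextWriter(sw)
        tw.Formatting = Formatting.Indented
        tw.Indentation = 2

        tw.WriteStartDocument()
        tw.WriteStartElement("BoxScore")
        tw.WriteStartElement("Visitor")
        tw.WriteStartElement("Player")
        tw.WriteAttributeString("Name", "Mark Bellhorn")
        tw.WriteElementString("AB", "4")

        ' Player, Visitor, and BoxScore are all still open here;
        ' WriteEndDocument closes them in the correct order
        tw.WriteEndDocument()
        tw.Close()

        ' Print the XML that was accumulated in memory
        Console.WriteLine(sw.ToString())
    End Sub

End Module
```

Relying on WriteEndDocument keeps the writing code short, although pairing each WriteStartElement with its own WriteEndElement, as Listing 3-9 does for the Player and Visitor elements, makes the intended nesting easier to follow.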