Using the Document Object Model (DOM)

XmlTextReader and XmlTextWriter are great classes to use when you want fast, forward-only reading and writing; however, they’re not perfect for every situation. You’ll also find times when you want to deal with XML data using the Document Object Model (DOM).

The DOM allows you to load an entire XML document into memory as a tree of nodes. With the whole structure in memory, you gain the ability to perform updates, insertions, and deletions anywhere within the XML document. Unfortunately, this gain comes at the cost of scalability, because memory use grows with the size of the document. The XmlTextReader and XmlTextWriter classes hold only a small part of the XML data in memory at a time, so they’re much more scalable.

The DOM is composed of the different items within an XML file. Each of these items is considered a node in the structure, as shown in Figure 19-3.

Figure 19-3.
The DOM for part of the Videos.xml file.

Figure 19-3 illustrates the nodes from the following XML file. This XML file is like the Videos.xml file created earlier in this chapter, in the section “Sample Application Using XmlTextWriter.”

<?xml version="1.0"?>
<VideoLibrary>
   <Video>
      <Title>Gone with the Wind</Title>
      <Length Measurement="minutes">111</Length>
      <star>Clark Gable</star>
      <rating>PG</rating>
   </Video>
   <Video>
      <Title>Winnie the Pooh</Title>
      <Length Measurement="minutes">93</Length>
      <star>Christopher Robin</star>
      <rating>G</rating>
   </Video>
</VideoLibrary>

As you can see, the DOM has a tree structure. Each item in the file is a node in the tree, with subnodes attached below. The data is represented as nodes too and is separate from the elements. Using DOM-related classes within the .NET Framework, you can gain access to any of these nodes.
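The separation between elements and their data can be seen by walking the tree in code. The following is a minimal sketch, not part of the original listing: the class name DomTreeDump, its methods, and the shortened sample markup are assumptions made for illustration. It loads the markup from a string by using the LoadXml method of XmlDocument and prints each node's type next to its name or value.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration; not part of the chapter's sample.
public static class DomTreeDump
{
    // Shortened copy of the sample XML shown above.
    public const string SampleXml =
        "<VideoLibrary><Video><Title>Gone with the Wind</Title>" +
        "<Length Measurement=\"minutes\">111</Length></Video></VideoLibrary>";

    // Builds an indented outline of a node and everything below it.
    public static string Describe(XmlNode node, int depth)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(new string(' ', depth * 2));
        sb.Append(node.NodeType.ToString());
        sb.Append(": ");
        // Text nodes carry their data in Value; elements carry a Name.
        sb.Append(node.NodeType == XmlNodeType.Text ? node.Value : node.Name);
        sb.Append('\n');
        foreach (XmlNode child in node.ChildNodes)
        {
            sb.Append(Describe(child, depth + 1));
        }
        return sb.ToString();
    }

    // Loads the sample markup and outlines it from the root element down.
    public static string DescribeSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return Describe(doc.DocumentElement, 0);
    }
}
```

Calling Console.Write(DomTreeDump.DescribeSample()) lists the Title element on one line and its text, Gone with the Wind, as a separate Text node indented beneath it. The Measurement attribute does not appear in the outline because attributes are not part of a node's ChildNodes collection.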

Working with XML Documents

When you use the DOM, you’ll have the ability to read and write as well as navigate through an XML file. A number of classes can be used to access the DOM from Visual C# .NET and the .NET Framework. These classes are listed in Table 19-3.

Table 19-3. Classes in the .NET Framework That Access the DOM 

Class                     Description
XmlNode                   Used to create objects that can hold a single node of an XML document.
XmlDocument               Used to hold an entire XML document object. Allows document navigation as well as editing.
XmlDocumentFragment       Used to hold a fragment of XML. This XML fragment can be inserted into a document or used for other purposes.
XmlElement                Used to work with element type nodes within an XML document.
XmlNodeList               Represents an ordered collection of nodes within an XML document.
XmlNamedNodeMap           Used to access a collection of nodes either by name or by an index value.
XmlAttribute              Used to work with an attribute type node within an XML document.
XmlCDataSection           Used to work with a CDATA section.
XmlText                   Used to hold the text content of an element or attribute.
XmlComment                Used to work with comments.
XmlDocumentType           Used to hold information associated with the document type declaration.
XmlEntity                 Used to hold an entity.
XmlEntityReference        Used to hold an entity reference node.
XmlNotation               Used to hold the notation declared within a Document Type Definition (DTD).
XmlProcessingInstruction  Used to hold a processing instruction.

The .NET Framework provides the XmlDocument class as the starting point for working with basic XML documents. This class will load the XML document into memory so that you can manipulate it.
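To show how several of the classes in Table 19-3 fit together, here is a minimal sketch that builds a small document entirely in memory. The class name DomEditSketch and its methods are hypothetical names chosen for this example. The factory methods it calls (CreateElement, CreateAttribute, and CreateTextNode) belong to XmlDocument, and AppendChild belongs to XmlNode.

```csharp
using System;
using System.Xml;

// Hypothetical helper for illustration; not part of the chapter's sample.
public static class DomEditSketch
{
    // Appends a new Video element, with Title and Length children,
    // to the root element of the document.
    public static XmlElement AddVideo(XmlDocument doc, string title, int minutes)
    {
        XmlElement video = doc.CreateElement("Video");

        // An element's data lives in a separate XmlText child node.
        XmlElement titleElement = doc.CreateElement("Title");
        XmlText titleText = doc.CreateTextNode(title);
        titleElement.AppendChild(titleText);
        video.AppendChild(titleElement);

        // Attributes are attached through the Attributes collection.
        XmlElement length = doc.CreateElement("Length");
        XmlAttribute unit = doc.CreateAttribute("Measurement");
        unit.Value = "minutes";
        length.Attributes.Append(unit);
        length.AppendChild(doc.CreateTextNode(XmlConvert.ToString(minutes)));
        video.AppendChild(length);

        doc.DocumentElement.AppendChild(video);
        return video;
    }

    // Builds a one-video library from scratch and returns its markup.
    public static string BuildSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.AppendChild(doc.CreateElement("VideoLibrary"));
        AddVideo(doc, "Winnie the Pooh", 93);
        return doc.OuterXml;
    }
}
```

New nodes are always created by the XmlDocument object that will own them and only then attached to a parent, which is why every Create call in the sketch goes through doc.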

Loading an XML Document into the DOM

To load an XML document into memory, you can use the Load method of the XmlDocument class, as shown here:

XmlDocument myDoc = new XmlDocument();
myDoc.Load(@"C:\Videos.xml");

The first line of code creates an XmlDocument object named myDoc. The file, Videos.xml, is then loaded into this XmlDocument object. Once loaded, the information can be viewed, updated, and saved.

The Load method has been overloaded so that you can also load from other sources. Possible sources include an XmlReader object, a TextReader object, a stream, and a URL or a file name.
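The overloads that take something other than a file name can be sketched as follows. This is an illustrative example only: the class name LoadSources, its method names, and the one-video markup are assumptions, and the in-memory StringReader and MemoryStream objects stand in for whatever reader or stream your application actually has.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration; not part of the chapter's sample.
public static class LoadSources
{
    private const string Xml =
        "<VideoLibrary><Video><Title>Winnie the Pooh</Title></Video></VideoLibrary>";

    // Load(TextReader): here the text comes from a string in memory.
    public static XmlDocument FromTextReader()
    {
        XmlDocument doc = new XmlDocument();
        using (StringReader reader = new StringReader(Xml))
        {
            doc.Load(reader);
        }
        return doc;
    }

    // Load(Stream): here the bytes come from a MemoryStream.
    public static XmlDocument FromStream()
    {
        XmlDocument doc = new XmlDocument();
        using (MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(Xml)))
        {
            doc.Load(stream);
        }
        return doc;
    }

    // Load(XmlReader): an XmlTextReader feeds the document.
    public static XmlDocument FromXmlReader()
    {
        XmlDocument doc = new XmlDocument();
        XmlTextReader reader = new XmlTextReader(new StringReader(Xml));
        try
        {
            doc.Load(reader);
        }
        finally
        {
            reader.Close();
        }
        return doc;
    }
}
```

All three methods produce the same in-memory tree. Only the source of the characters differs.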

In addition to the Load method, there’s also a LoadXml method. The LoadXml method receives a string as a parameter. This string contains well-formed XML that will be placed into the XmlDocument object. It’s important to note that, by default, white space isn’t preserved and that LoadXml performs no DTD or schema validation. If white space is important to you, set the PreserveWhitespace property to true before loading. If validation is important to you, you’ll want to use the Load method and pass it a reader that validates.
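The effect of white space handling when loading from a string can be checked with a short sketch. The class name LoadXmlSketch and the indented markup are assumptions made for illustration. LoadXml is the string-loading method of XmlDocument, and PreserveWhitespace is the XmlDocument property that controls whether white space nodes are kept.

```csharp
using System;
using System.Xml;

// Hypothetical helper for illustration; not part of the chapter's sample.
public static class LoadXmlSketch
{
    // Indented markup, so there is white space between the elements.
    private const string Xml =
        "<VideoLibrary>\n" +
        "   <Video>\n" +
        "      <Title>Winnie the Pooh</Title>\n" +
        "   </Video>\n" +
        "</VideoLibrary>";

    // Returns the number of nodes directly below the root element.
    public static int CountRootChildren(bool preserveWhitespace)
    {
        XmlDocument doc = new XmlDocument();
        // Must be set before loading to have any effect.
        doc.PreserveWhitespace = preserveWhitespace;
        doc.LoadXml(Xml);
        return doc.DocumentElement.ChildNodes.Count;
    }
}
```

With the default setting, CountRootChildren(false) returns 1, the Video element. With CountRootChildren(true) the result is 3, because the line breaks and indentation before and after the Video element become white space nodes. This matters for code that indexes into ChildNodes by position.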

Working with Document Data

Once the XML document has been loaded into memory, you can begin working with the information stored in the document. Because the entire file is in memory, you can navigate and manipulate the information in any manner you choose. A number of classes (listed earlier in this chapter, in Table 19-3) and methods are available to make this easy for you.

It’s important to understand how the XML document is organized. Figure 19-3 earlier in this chapter showed the node structure for an XML document. In Figure 19-4, you see a portion of the same figure with additional information: references to some of the nodes.

Figure 19-4.
Node access in an XML document.

Access to all of the elements is through the XmlDocument object you create. Within the XmlDocument object, you can access the root element using the DocumentElement property. From there, you can access each of the nodes and subnodes using the ChildNodes property, an XmlNodeList collection that you index like an array. Although the root element is accessed differently than the child nodes (using the DocumentElement property instead of the ChildNodes collection), the properties and methods of both are virtually the same because both are types of XmlNode objects.

Notice also that nodes positioned at the same level are peers to each other. Peers are accessed in the same manner, through the ChildNodes collection of their parent. The only difference is that each peer has a different index. If a child node has subnodes, use its own ChildNodes collection to access them, continuing for as many levels as there are subnodes. Once you’ve reached the level of the node you’re interested in using, you can use other methods and properties to access the node’s information.
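The level-by-level navigation described above can be sketched as a small helper. The class name DomNavigation and its methods are hypothetical. The markup is the sample file from earlier in this section, written without indentation.

```csharp
using System;
using System.Xml;

// Hypothetical helper for illustration; not part of the chapter's sample.
public static class DomNavigation
{
    // Loads the two-video sample document from a string.
    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<VideoLibrary>" +
            "<Video><Title>Gone with the Wind</Title>" +
            "<Length Measurement=\"minutes\">111</Length>" +
            "<star>Clark Gable</star><rating>PG</rating></Video>" +
            "<Video><Title>Winnie the Pooh</Title>" +
            "<Length Measurement=\"minutes\">93</Length>" +
            "<star>Christopher Robin</star><rating>G</rating></Video>" +
            "</VideoLibrary>");
        return doc;
    }

    // Walks root -> Video -> item -> text, one level per step.
    public static string GetItemText(XmlDocument doc, int videoIndex, int itemIndex)
    {
        XmlNode root = doc.DocumentElement;           // VideoLibrary
        XmlNode video = root.ChildNodes[videoIndex];  // peers differ only by index
        XmlNode item = video.ChildNodes[itemIndex];   // Title, Length, star, or rating
        return item.ChildNodes[0].Value;              // the text node below the item
    }
}
```

For example, DomNavigation.GetItemText(doc, 1, 0) reaches the Title of the second Video element and returns Winnie the Pooh, while GetItemText(doc, 0, 2) returns Clark Gable.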

Saving a File from the DOM

After you’ve made changes to a document, you might want to save them. Remember, an XmlDocument object is stored in memory. When you make changes, they’re made in memory, not in the file. You must save the XmlDocument object if you want to keep the changes. The Save method writes the document out. In addition to passing a file name, you can also save to an XmlWriter object, a TextWriter object, or a stream.
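Saving to a TextWriter instead of a file can be sketched as follows. The class name SaveSketch, its methods, and the one-element markup are assumptions for illustration. A StringWriter is used so that the saved markup can be inspected without touching the disk.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; not part of the chapter's sample.
public static class SaveSketch
{
    // Writes the in-memory document to a TextWriter and returns the text.
    public static string SaveToString(XmlDocument doc)
    {
        using (StringWriter writer = new StringWriter())
        {
            doc.Save(writer);
            return writer.ToString();
        }
    }

    // Changes a value in memory, then saves the changed document.
    public static string UpdateAndSave()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<VideoLibrary><Video><rating>PG</rating></Video></VideoLibrary>");

        // The change exists only in memory until Save is called.
        XmlNode rating = doc.DocumentElement.ChildNodes[0].ChildNodes[0];
        rating.ChildNodes[0].Value = "G";

        return SaveToString(doc);
    }
}
```

The string returned by UpdateAndSave contains the updated rating element. Passing a file name to Save instead of the writer would put the same markup on disk.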

Sample Application Using the DOM

The following code is a program that steps through the child elements of the first Video node in the Videos.xml file you created earlier in this chapter, in the section “Sample Application Using XmlTextWriter.” It allows you to update the data, which is then saved to a new file (NewVideos.xml).

using System;
using System.Drawing;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;
using System.Data;
using System.Xml;
using System.Text;

namespace ReadDataSet
{
    public class frmReadXML : System.Windows.Forms.Form
    {
        private System.Windows.Forms.Label label1;
        private System.Windows.Forms.Label label2;
        private System.Windows.Forms.TextBox txtValue;
        private System.Windows.Forms.TextBox txtName;
        private System.Windows.Forms.Button btnExit;
        private System.Windows.Forms.Button btnNext;
        /// <summary>
        /// Required designer variable.
        /// </summary>
        private System.ComponentModel.Container components = null;
        private System.Windows.Forms.Label txtDisplay;
        // Counter to loop through elements
        private int ctr = 0;
        private System.Windows.Forms.Button btnUpdate; 

        XmlDocument myDoc = null;  // For XmlDocument object

        public frmReadXML()
        {
            //
            // Required for Windows Form Designer support
            //
            InitializeComponent();

            // TODO: Add any constructor code 
            // after InitializeComponent call.
            
            // Allocate XmlDocument object.
            myDoc = new XmlDocument();

            // Load XML data.
            myDoc.Load(@"C:\Videos.xml");
        }

        /// <summary>
        /// Clean up any resources being used.
        /// </summary>
        protected override void Dispose( bool disposing )
        {
            if( disposing )
            {
               if (components != null) 
               {
                   components.Dispose();
               }
            }
            base.Dispose( disposing );
        }

        #region Windows Form Designer generated code
        /// <summary>
        /// Required method for Designer support - do not modify
        /// the contents of this method with the code editor.
        /// </summary>
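        private void InitializeComponent()
        {
            // The designer-generated code that creates and positions
            // the controls is omitted from this listing.
        }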
        #endregion

        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main() 
        {
            Application.Run(new frmReadXML());
        }

        private void btnExit_Click(object sender, System.EventArgs e)
        {
            myDoc.Save(@"C:\newVideos.xml");
            Application.Exit();
        }

        private void btnNext_Click(object sender, System.EventArgs e)
        {    
            XmlNode aNode = myDoc.DocumentElement
                .ChildNodes[0].ChildNodes[ctr];

            if (++ctr >= myDoc.DocumentElement
                .ChildNodes[0].ChildNodes.Count)
                ctr = 0;

            txtName.Text    = aNode.Name;
            txtValue.Text   = aNode.ChildNodes[0].Value;
            txtDisplay.Text = aNode.InnerText;
        }

        private void btnUpdate_Click(object sender, 
            System.EventArgs e)
        {
            int tmp = 0;
            // Need the previous value of the ctr variable, because
            // btnNext_Click advances ctr after displaying a node.
            if (ctr == 0) 
            {
                // Set tmp to the last element in the collection (Count - 1).
                tmp = myDoc.DocumentElement
                    .ChildNodes[0].ChildNodes.Count - 1;
            }
            else 
                tmp = ctr - 1;

            // If no node has been displayed yet, skip the update.
            // txtName is blank until Next is clicked for the first time.
            if( txtName.Text != "" )
            {
                myDoc.DocumentElement.ChildNodes[0]
                    .ChildNodes[tmp].ChildNodes[0].Value = 
                    txtValue.Text;
                txtDisplay.Text = "Updated: " 
                    + myDoc.DocumentElement.ChildNodes[0]
                    .ChildNodes[tmp].ChildNodes[0].Value;
            }
        }
    }
}

This listing displays the form shown in Figure 19-5.

Figure 19-5.
Sample XML DOM application.

Keep in mind that this application isn’t particularly practical. It does, however, illustrate some of the basic uses of XmlDocument objects as well as some of the basic navigation techniques. As you can see, the Load method is used to open the Videos.xml file from the root of the C drive. You can also see in the btnExit_Click event handler that the Save method is used to write the resulting XML to a file named NewVideos.xml, also on the root of C.

In the btnNext_Click event handler, an XmlNode variable is declared and set to refer to a node within the document, as follows:

XmlNode aNode = myDoc.DocumentElement.ChildNodes[0].ChildNodes[ctr];

Below the root element of the myDoc object (its DocumentElement) are the subnodes, or child nodes. The first subnode, ChildNodes[0], is the first Video element. The first Video element contains the subnodes for each of the items within a video. In this example, there are four items: title, length, star, and rating. Using a counter, the ctr variable, you can step through each of these items. For each of these subelements, you can then display a name and a value. Within the btnNext_Click event handler, the use of the XmlNode object makes the remaining code much simpler, because you can work with the node’s methods and properties directly. In the btnUpdate_Click event handler, instead of an XmlNode object, the full path from the XmlDocument object, myDoc, is used each time. As you can see, both approaches work, and which one you decide to use depends on your personal preferences.
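The two access styles reach the same node because an XmlNode variable is a reference into the document tree, not a copy of the data. The following minimal sketch illustrates this. The class name NodeAccessStyles, its methods, and the shortened markup are hypothetical and appear nowhere in the sample application.

```csharp
using System;
using System.Xml;

// Hypothetical helper for illustration; not part of the chapter's sample.
public static class NodeAccessStyles
{
    // Loads a one-video document from a string.
    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<VideoLibrary><Video><Title>Gone with the Wind</Title>" +
            "<rating>PG</rating></Video></VideoLibrary>");
        return doc;
    }

    // Style 1: hold the node in a local XmlNode variable, then use it.
    public static void UpdateViaNode(XmlDocument doc, int index, string text)
    {
        XmlNode aNode = doc.DocumentElement.ChildNodes[0].ChildNodes[index];
        aNode.ChildNodes[0].Value = text;
    }

    // Style 2: walk the full path from the document every time.
    public static string ReadViaDocument(XmlDocument doc, int index)
    {
        return doc.DocumentElement.ChildNodes[0]
            .ChildNodes[index].ChildNodes[0].Value;
    }
}
```

After UpdateViaNode(doc, 1, "G"), a call to ReadViaDocument(doc, 1) returns G, which shows that the change made through the local variable landed in the document itself.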

A number of additional properties, methods, and events are used with the classes listed earlier in Table 19-3. Check the online help for additional information.


