Before delving into the details of this much-awaited feature, you need to learn some basic information about XML (and SQL Server 2000?generated XML in particular).
Because most developers will exploit SQL Server 2000's XML capabilities for Web projects, this chapter assumes that you are familiar with the basics of Hypertext Markup Language (HTML). If not, we recommend Sams Teach Yourself HTML and XHTML in 24 Hours (0-672-32076-2).
Like HTML, XML is character data consisting of elements?alphanumeric names delimited by < and >?and attributes?name-value pairs of the form name="value"?that reside inside elements, separated from other attributes by whitespace.
The big difference between HTML and XML is that the HTML elements specify display information (how big a font should be for a given element, how bold, and so on), and XML elements do not. XML provides only a specification for how data should be structured and contains no inherent display information. It is up to the object that uses the XML to determine whether and how the data should be displayed.
In XML, elements and attributes both contain only character data. Following is an example of a single element that has an attribute and a value:
Unlike HTML, no set list of predefined XML tags exists?the tags can be anything you want, provided that they adhere to a few key rules. For XML documents to be considered well formed (that is, suitable for an XML parser to process without error), they must meet the following requirements:
Every element or tag must have a corresponding end tag. For example: <element>DATA</element> or even <element></element>. If an element has no content (known as having empty content), it might also terminate in the following character sequence: />, such as <element/>.
XML parsers are case sensitive with respect to element and attribute names. When running your XML data through a parser after you get it back from a SQL Server 2000 query, be sure to remember that in the mind of the parser, <Myelement> is not the same as <myelement>, nor is Myattribute the same as myattribute.
XML documents must have only one root element containing all other elements. Depending on the type of query you design, SQL Server 2000 might return a document fragment. These fragments are XML documents that are well formed except that they lack a top-level (or root) element. To illustrate this, test the following URL in IE: http://<myserver>/NorthwindVdir?sql=SELECT TOP 2 CustomerID FROM Customers FOR XML RAW.
The error message that results is shown in Figure 41.8.
IE uses an XML parser known as MSXML to display XML documents. The error message it displays (XML document must have a top level element) tells you that your XML is not well formed. This is because the query results in a document fragment. It has two <row> elements but no root element enclosing them both. (If you had selected a single row, however, the results would be well formed.) To render this XML as well formed, add the string &root=test2XML to the end of the URL to get a root element called test2XML around the resulting rows (see Figure 41.9).
Elements can't overlap. The following is an example of overlapping elements that is legal in HTML but not XML:
<h1>This Header <h2>has <h3>overlapping </h2> </h3>elements</h1>.
The good news: SQL Server will never create XML that looks like this.
XML has naming conventions for elements and attributes. Most of the time you can opt out of naming the XML elements that are returned from SQL Server?the element and attribute names will match the table and column names being selected.
Aliasing these names, however, provides an opportunity to change them. The good news, once again: SQL Server will never create XML that violates the following XML naming conventions:
Element or attribute names should not contain whitespace, the word xml, or a colon (:).
Element or attributes must begin with letters or underscores (_), but not digits or any other punctuation.
Numbers are permitted in names after the first character, as well as hyphens (-) and periods (.).
If your table, column names, or aliases do contain unsupported XML characters, you can still retrieve data from them using the FOR XML clause. SQL Server 2000 will permit you to select these objects (Order Details, for example), but it will convert any offending characters (in this case a single space) into a string representation of the hexadecimal equivalent of the character, resulting in element names such as <Order_x0020_Details>.