Retrieving XML-Data Schemas

An important feature of SQL Server 2000 is the ability to retrieve XML-formatted metadata that defines the content model (what elements will be present, their nesting structure, and what types of data they contain) of an XML document.

This metadata comes in the form of a well-formed XML document known as an XML-Data schema. It can be returned in queries that use any of the three FOR XML modes, and to get it, you specify the XMLDATA option, as exemplified in Listing 41.6.

Listing 41.6 Using the XMLDATA Option with FOR XML AUTO to Return an XML-Data Schema
SELECT TOP 2 OrderID, OrderDate, CustomerID
FROM Orders


<Schema name="Schema1" xmlns="urn:schemas-microsoft-com:xml-data"
        <ElementType name="Orders" content="empty" model="closed">
               <AttributeType name="OrderID" dt:type="i4"/>
               <AttributeType name="OrderDate" dt:type="dateTime"/>
               <AttributeType name="CustomerID" dt:type="string"/>
               <attribute type="OrderID"/>
               <attribute type="OrderDate"/>
               <attribute type="CustomerID"/>
<Orders xmlns="x-schema:#Schema1" OrderID="10248"
        OrderDate="1996-07-04T00:00:00" CustomerID="VINET"/>
<Orders xmlns="x-schema:#Schema1" OrderID="10249"
        OrderDate="1996-07-05T00:00:00" CustomerID="TOMSP"/>

First note that the schema is always output directly on top of your XML results. Schema is always its root element, and its name attribute has a special function: It declares the document as a namespace. When a namespace is used, elements in other XML documents might contain the elements defined in this schema by specifying the name of the schema as the value of their xmlns (XML Namespace) attribute.

Orders elements, for example, are linked to Schema1 by way of their xmlns attribute. The value of xmlns (preceded by "x-schema:") points back to the schema as a way of indicating that the metadata in the schema applies to Orders elements. The # sign indicates that the schema is inline (it works just like the # sign does in HTML links) or contained within the XML document it describes. (Note also that schemas themselves refer to a Microsoft namespace in their xmlns attribute.)

The name attribute will always have a value of Schema followed by an integer. This integer is incremented automatically by SQL Server after every query generated during the same session to prevent what is known as a namespace collision?when two XML documents declare the same namespace. It's necessary to rename the schema in this way because it differentiates one schema from any other that might have been produced by a query executed during the same SQL Server session.

The structure of the schema provides useful information about the XML. The values of its elements and attributes will differ depending on the mode and options you specify in the FOR XML clause. The elements that will be present (as of this writing?please note that the specification for XML Schemas is a work in progress) are as follows:

  • ElementType?For every XML element, an ElementType element that defines it is produced. It has the following attributes:

    [View full width]
    <ElementType content="{empty | textOnly | eltOnly | mixed}" dt:type="data type graphics/ccc.gif" model="{open | closed}" name="idref" order="{one | seq | many}">

    The most useful attribute of ElementType is dt:type. It tells any code you use to process the schema what kind of data the element named in its name attribute contains. When you convert the schema using an XML stylesheet, for example, it is far easier to generically parse XML elements based on the value of dt:type than by testing the element's value.

    The value of dt:type is a string representation of the XML datatype to which the SQL Server datatype of the selected column corresponds. The most common are dateTime?corresponding to datetime, i4?a four-byte integer corresponding to int, and string?corresponding to varchar. See the MSDN Online topic titled "XML Data Types" for more information.

    The content attribute is also of interest. It specifies how the XML for the named element is formed?whether it is empty (contains no data), textOnly (contains only data but no child elements), eltOnly (contains elements only) or mixed (contains both data and child elements).

  • AttributeType? ElementType elements contain these elements. They specify the name and type of any attributes that the element specified in its name attribute have.

  • attribute? ElementType elements contain these elements. They define an element's attributes and refer back to AttributeType via their type attribute.

Knowing these things about your XML results before parsing them enables you to write generic processing code that is far more likely to be reused than code that is purely data-specific.

    Part III: SQL Server Administration
    Part IV: Transact-SQL
    Part V: SQL Server Internals and Performance Tuning
    Part VI: Additional SQL Server Features