5.4 RDF Schema Alternatives

RDF isn't the only specification related to describing schemas. XML documents (and their SGML predecessors) have long been validated through the use of Document Type Definitions (DTDs), described in the first release of the XML specification and still in heavy use. DTDs generally define how elements relate to one another within a schema; for example, they allow applications to check whether a specific element is required or whether one or more elements can be contained within another.

While useful for validating how elements within a schema relate to one another, DTDs have long had their critics. First of all, DTDs are based on a syntax totally unrelated to XML. This forces a person to become familiar with not one but two syntaxes in order to create an XML document that is valid as well as well formed. The following DTD fragment defines an Items element, its child item, and the contents of item:

<!ELEMENT Items (item*)>
<!ELEMENT item (productName, quantity, USPrice, comment?, shipDate?)>
<!ATTLIST item partNum CDATA #REQUIRED>
<!ELEMENT productName (#PCDATA)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT USPrice (#PCDATA)>
<!ELEMENT comment (#PCDATA)>
<!ELEMENT shipDate (#PCDATA)>

As you can see, the DTD syntax is fairly intuitive; however, syntactic elegance or not, DTDs do not provide the same type of functionality as the RDF specification. XML DTDs define how elements within a vocabulary relate to one another, not how they relate to the world at large, and the description of their contents is pretty vague. #PCDATA and its attribute cousin, CDATA, just mean "text." RDF provides a means of recording data within a global context, not just how elements in one specific vocabulary relate to one another.
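To make concrete what a DTD content model does and does not check, the item declaration above can be mimicked as a regular expression over child element names. This is only a sketch in Python, not how a validating parser works internally; the helper name matches_item_model and the sample documents are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content model from the DTD: (productName, quantity, USPrice, comment?, shipDate?)
# Each child name is followed by a space so that names cannot run together.
ITEM_MODEL = re.compile(r"productName quantity USPrice (comment )?(shipDate )?")

def matches_item_model(item):
    """Return True if the children of <item> appear in the order the DTD allows."""
    names = "".join(child.tag + " " for child in item)
    return ITEM_MODEL.fullmatch(names) is not None

# Hypothetical sample data; note the non-numeric quantity.
good = ET.fromstring(
    "<item partNum='872-AA'><productName>Lawnmower</productName>"
    "<quantity>lots</quantity><USPrice>148.95</USPrice></item>"
)
# Same children, but productName and quantity are swapped.
bad = ET.fromstring(
    "<item partNum='872-AA'><quantity>1</quantity>"
    "<productName>Lawnmower</productName><USPrice>148.95</USPrice></item>"
)

print(matches_item_model(good))  # structure is accepted even though quantity is "lots"
print(matches_item_model(bad))   # children are out of order, so this is rejected
```

The first document passes because the check looks only at element order; a quantity of "lots" is still just text as far as #PCDATA is concerned.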

Another mechanism for recording schemas is defined by the W3C XML Schema 1.0 Specification. This specification is more closely related to the functionality used to define a relational table or to describe an object in object-oriented development. Schemas are used to define elements in relation to one another, as with the DTD syntax; W3C XML Schema goes beyond DTDs, though, by providing a means of recording data types for the elements and attributes, a functionality long needed with XML vocabularies, as shown in the following fragment based on the specification:

<xsd:element name="Items">
 <xsd:complexType>
  <xsd:sequence>
   <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
    <xsd:complexType>
     <xsd:sequence>
      <xsd:element name="productName" type="xsd:string"/>
      <xsd:element name="quantity">
       <xsd:simpleType>
        <xsd:restriction base="xsd:positiveInteger">
         <xsd:maxExclusive value="100"/>
        </xsd:restriction>
       </xsd:simpleType>
      </xsd:element>
      <xsd:element name="USPrice"  type="xsd:decimal"/>
      <xsd:element ref="comment"   minOccurs="0"/>
      <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
     </xsd:sequence>
     <xsd:attribute name="partNum" type="SKU" use="required"/>
    </xsd:complexType>
   </xsd:element>
  </xsd:sequence>
 </xsd:complexType>
</xsd:element>

As you can see, W3C XML Schema is an effective specification for defining XML elements, their relationships, and much more information about associated data types than DTDs provide.
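What the quantity declaration adds over #PCDATA can be sketched as a hand-written check. The Python below only approximates xsd:positiveInteger restricted by a maxExclusive facet of 100; it is not a schema validator, the function name is_valid_quantity is hypothetical, and the lexical check is deliberately simplified.

```python
import re

def is_valid_quantity(text):
    """Approximate xsd:positiveInteger restricted by maxExclusive="100"."""
    text = text.strip()  # surrounding whitespace is ignored
    if not re.fullmatch(r"\+?[0-9]+", text):
        return False     # not an integer literal at all
    value = int(text)
    return 1 <= value < 100  # positive, and strictly below the facet value

# Try a few sample values, including the boundary cases.
for sample in ["1", "99", "100", "0", "lots"]:
    print(sample, is_valid_quantity(sample))
```

A schema processor derives this kind of check from the declaration itself, so the constraint lives in the schema rather than in application code.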

A third approach, RELAX NG Compact Syntax, offers a combination of DTD readability and W3C XML Schema data typing, though it also has a mathematical foundation that in some ways has more in common with RDF than with DTDs or W3C XML Schema. The same example in RELAX NG Compact Syntax looks like:

Items = element Items { item* }
item =
  element item {
    att.partNum, productName, quantity, USPrice, comment?, shipDate?
  }
att.partNum = attribute partNum { text }
productName = element productName { text }
quantity = element quantity { xsd:positiveInteger {maxExclusive="100"}}
USPrice = element USPrice { xsd:decimal }
comment = element comment { text }
shipDate = element shipDate { xsd:date }

start = Items

All of these schema approaches facilitate automated processing of XML. Still, the various XML schema tools can't replace the functionality provided by the RDF specification. To overgeneralize, XML tools are concerned with describing markup representations and their contents, while RDF tools are concerned with describing models. You can get a model from a representation or vice versa, but the two approaches focus on different things.
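The step from a representation to a model can be sketched by reading one item element and emitting subject-predicate-object statements. This is a hand-rolled illustration, not an RDF toolkit or a standard mapping; the namespace http://example.org/po#, the function item_to_triples, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET

EX = "http://example.org/po#"  # hypothetical vocabulary namespace

def item_to_triples(item):
    """Turn one <item> element into (subject, predicate, object) statements."""
    # Assumption: the partNum attribute is a good enough identifier for the item.
    subject = EX + "item/" + item.get("partNum")
    return [(subject, EX + child.tag, child.text) for child in item]

item = ET.fromstring(
    "<item partNum='872-AA'><productName>Lawnmower</productName>"
    "<quantity>1</quantity><USPrice>148.95</USPrice></item>"
)

for triple in item_to_triples(item):
    print(triple)
```

The element order and nesting that the schema languages describe disappear in the result; what remains is a set of statements about a resource.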

The RDF specification defines information about data within a particular context. It provides a means of recording information at a metadata level that can be used regardless of the domain. RDF's relationship with XML is that XML is used to serialize an RDF model; RDF is totally unconcerned with whether that XML is valid (that is, conforming to a DTD, RELAX NG description, or W3C XML Schema) as long as the XML used to serialize an RDF model is well formed. In addition, concepts such as data types and complex and simple element structures, which are focal points within W3C XML Schema, again focus on XML as used to define data, primarily for data interchange; they have nothing to do with recording data about data in order to facilitate intelligent web functionality.
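The difference between well formed and valid can be illustrated with a parser that consults no DTD or schema at all. The sketch below uses Python's standard xml.etree.ElementTree module; the RDF/XML sample, the ex: namespace, and the helper is_well_formed are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# A small RDF/XML document; the ex: vocabulary is hypothetical.
well_formed = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/po#">
  <rdf:Description rdf:about="http://example.org/po#item/872-AA">
    <ex:productName>Lawnmower</ex:productName>
  </rdf:Description>
</rdf:RDF>"""

# The same document with the closing tag of ex:productName removed.
not_well_formed = well_formed.replace("</ex:productName>", "")

def is_well_formed(xml_text):
    """Return True if the text parses as XML; no DTD or schema is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(well_formed))      # parses without any schema
print(is_well_formed(not_well_formed))  # mismatched tags stop the parser
```

Only the second document is unusable as a serialization; whether the first would pass some schema is a separate question that RDF does not ask.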