13.4 Extending the Specification Through Modules

When writing a specification or a standard, the authors can take one of two approaches: they can try to capture the entire world encompassed by the specification, a process that can take years, or they can define a minimal set of elements and provide a mechanism for extensions. The RSS Working Group opted for the latter: start small, and provide a carefully defined extension mechanism. For RSS, the extension mechanism is the module.

RSS modules are sets of elements that are kept distinct from other modules through the use of XML namespaces. Different modules can define elements with the same name, and both can be used in the same RSS document without fear of collision as long as each module has its own namespace. The following is the namespace declaration for the Syndication module:

xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
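To make the collision-avoidance point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The feed fragment is hypothetical: the "ex" module and its namespace URI are invented for illustration, and only the Syndication namespace comes from the text. It shows two modules using the same local element name while remaining distinguishable by namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment. The "ex" module and its namespace URI are
# invented for illustration; only the Syndication namespace is from the text.
RSS_FRAGMENT = """\
<channel xmlns="http://purl.org/rss/1.0/"
         xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
         xmlns:ex="http://example.org/rss/modules/example/">
  <sy:updatePeriod>hourly</sy:updatePeriod>
  <ex:updatePeriod>PT1H</ex:updatePeriod>
</channel>
"""

# Prefix-to-URI map used for lookups; the prefixes are local shorthand only.
NS = {
    "sy": "http://purl.org/rss/1.0/modules/syndication/",
    "ex": "http://example.org/rss/modules/example/",
}


def module_values(xml_text):
    """Return the text of each module's updatePeriod element, keyed by prefix."""
    root = ET.fromstring(xml_text)
    return {
        prefix: root.find(f"{prefix}:updatePeriod", NS).text
        for prefix in NS
    }


def expanded_names(xml_text):
    """Return the namespace-qualified tag of each child of the root element."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]


if __name__ == "__main__":
    print(module_values(RSS_FRAGMENT))
    for name in expanded_names(RSS_FRAGMENT):
        print(name)
```

Although both elements share the local name updatePeriod, the parser expands each to a different namespace-qualified name, so a consumer can read one module's value without ever seeing the other's.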

Mechanically, the use of namespaces in RSS is no different from the use of namespaces in more general RDF. The difference between the two is one of scope: in general RDF, a new namespace is typically used to define a relatively complete vocabulary, whereas an RSS module namespace covers a small set of elements.

According to the RSS 1.0 modules guide (found at http://groups.yahoo.com/group/rss-dev/files/Modules/modules.html), module designers should narrow the focus of their module to a specific need. The premise behind this is that many small modules are more manageable and more targeted than a few big, all-encompassing modules. The guidelines also recommend following a simple (flat) model for new modules over a rich (nested, complex) model whenever possible, so that modules are more easily mixed and managed together.

One final rule for module developers is a fairly significant one, and it has to do with the rdf:parseType attribute: if an element sets this attribute to a value of "Literal", then that element can contain any type of XML, including non-RDF-compliant XML. The reason for this "loophole" in the RSS specification is to allow modules to be added without strict compliance with the rules governing the use of RDF.

An example of the use of rdf:parseType within RSS is the following, pulled from the modules guideline document:

<dc:creator rdf:parseType="Literal">
  <name>
    <firstname>John</firstname>
    <middle_initial>Q.</middle_initial>
    <lastname>Public</lastname>
  </name>
</dc:creator>

In this example, the data contained within the dc:creator element is treated as a literal, and RSS/RDF parsers will return all of it as one large string.

The use of parseType="Literal" cuts off the XML contained within the element from full integration into the RDF, because individual elements contained in the data aren't discretely accessible. In my opinion, this shortcut defeats the purpose of having a metalanguage.
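The effect of parseType="Literal" can be approximated with a short sketch. This uses only Python's standard xml.etree.ElementTree and is not an RDF parser; the literal_value helper is a hypothetical name, and the namespace declarations on the sample are added here so that the fragment parses on its own. It mimics what the text describes: the element's content comes back as a single string.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The dc:creator example from the text, with namespace declarations added
# so that it is a standalone, parseable fragment.
SAMPLE = f"""\
<dc:creator xmlns:dc="{DC_NS}" xmlns:rdf="{RDF_NS}" rdf:parseType="Literal">
  <name>
    <firstname>John</firstname>
    <middle_initial>Q.</middle_initial>
    <lastname>Public</lastname>
  </name>
</dc:creator>
"""


def literal_value(element):
    """Approximate an RDF parser: return the element's content as one string.

    Hypothetical helper for illustration; a real RDF parser also applies
    canonicalization rules that this sketch ignores.
    """
    if element.get(f"{{{RDF_NS}}}parseType") != "Literal":
        raise ValueError("element is not marked rdf:parseType='Literal'")
    # Leading text, then each child serialized back to markup (tail included).
    parts = [element.text or ""]
    for child in element:
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


if __name__ == "__main__":
    creator = ET.fromstring(SAMPLE)
    value = literal_value(creator)
    print(type(value).__name__)
    print(value)
```

Because the value is an opaque string, a consumer who wants only the last name has to parse that string a second time as XML; nothing in the RDF model itself exposes lastname as a separate property.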