3.8 RDF/XML: Separate Documents or Embedded Blocks

By convention, RDF/XML files are stored as separate documents and given the extension of .rdf (just rdf for Mac systems). The associated MIME type for an RDF/XML document is: application/rdf+xml.

There's been considerable discussion about embedding RDF within other documents, such as within non-RDF XML and HTML. I've used RDF embedded within HTML pages, and I know of other applications that have done the same.

The problem with embedding, particularly within HTML documents, is that it's not a simple matter to separate the RDF/XML from the rest of the content. If the RDF/XML used consists of a resource and its associated properties listed as attributes of the resource, this isn't a problem. An example of this would be:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    dc:title="Good RSS"
    dc:description="Mark Pilgrim and Sam Ruby created an RSS Validator for us to use 
to validate our RSS feeds, and Bill Kearney was kind enough to host it. Many 
appreciations, folks. I ran the Validator against my RSS feeds (both Userland..."
    dc:date="2002-10-22T09:46:26-06:00" />

This is RDF/XML that's generated by a weblogging tool called Movable Type (found at http://moveabletype.org). It's used for the tool's trackback feature, which allows webloggers to notify each other when they reference each other's posts in their own.

All of the data is contained in RDF/XML element attributes. Because every property is written as an attribute, no element has text content, so there is nothing for the HTML parser to pick up and display in the page.
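The difference between the attribute form and the element form can be sketched in a few lines of Python. This is only an illustration: Python's standard html.parser stands in for a browser's HTML parser, and the VisibleText class, the visible_text helper, and the two sample fragments are hypothetical names and data made up for this sketch.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect the non-whitespace text an HTML parser would treat as content."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Attribute values never arrive here; only text between tags does.
        if data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical fragment: every property is an attribute.
attribute_form = '''<p>Post body</p>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    dc:title="Good RSS"
    dc:date="2002-10-22T09:46:26-06:00" />'''

# Hypothetical fragment: the same title written as element content.
element_form = '''<p>Post body</p>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description>
    <dc:title>Good RSS</dc:title>
  </rdf:Description>
</rdf:RDF>'''

attribute_text = visible_text(attribute_form)
element_text = visible_text(element_form)
print(attribute_text)
print(element_text)
```

With the attribute form, only the paragraph text is reported as content. With the element form, the title text is reported as well, which is the text that would leak into the displayed page.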

This is pretty handy, but not all RDF/XML can use the abbreviated syntax that allows us to convert RDF properties to XML attributes. In those cases, the approach I use to embed RDF within an HTML document is to include it within script tags, as demonstrated in Example 3-21.

Example 3-21. Embedding RDF in HTML script elements
<script type="application/rdf+xml">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    dc:title="Apple&apos;s Open Core"
    dc:description="As happened last year with the Macworld conference, you 
might as well bag writing about anything else because this week will be 
Apple, Apple, Apple. Two big stories - a newer, longer TiBook and Safari, 
Apple&apos;s entry into the browsing market. I liked some features of the 
new TiBook such as the backlit keyboard, which I think is one of the best 
ideas I&apos;ve heard with a laptop; I know I wish I had this with my 
TiBook. However, I&apos;m less impressed with the length of the TiBook - 
17 inches. My 15 inch works nicely, I drag it about the house and everywhere 
I go with no effort. All that extra length with the new TiBook does is make 
it too long for most computer carry bags. Heck, it&apos;s too long for most
laps. What Apple needs to do is incorporate all the other goodies into its 
15 inch model. Including the airport, Bluetooth, the graphics card, and that
nifty backlit feature. That would be a tasty morsel, and I&apos;d be putting 
up a PayPal donation button to have you all buy it for me. And the Titanium 
PowerBooks are still the sexiest computer on earth. An even bigger..."
    dc:date="2003-01-08T09:34:36-06:00" />
</script>

The HTML parser ignores the script contents, assuming that the application/rdf+xml content will be processed by some application geared to this data type. This approach works rather well except for one thing: it doesn't allow an HTML page to validate as XHTML, and many organizations insist that their web pages validate as XHTML.
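An application geared to this data type might pull the script contents out along the following lines. This is a minimal sketch, not a full implementation: the RDFScriptExtractor class, the extract_rdf_blocks helper, and the sample page are hypothetical, and the standard library's ElementTree is used only to read the attributes. It is a stand-in for a real RDF/XML parser, which would build actual RDF statements.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


class RDFScriptExtractor(HTMLParser):
    """Collect the text of every <script type="application/rdf+xml"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_rdf_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/rdf+xml":
            self._in_rdf_script = True
            self._buffer = []

    def handle_data(self, data):
        # Script contents are passed through as raw text, possibly in pieces.
        if self._in_rdf_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_rdf_script:
            self.blocks.append("".join(self._buffer))
            self._in_rdf_script = False


def extract_rdf_blocks(html):
    parser = RDFScriptExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


# Hypothetical page, shortened from Example 3-21.
page = """<html><head>
<script type="application/rdf+xml">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    dc:title="Apple&apos;s Open Core"
    dc:date="2003-01-08T09:34:36-06:00" />
</script>
</head><body><p>Post body</p></body></html>"""

blocks = extract_rdf_blocks(page)
root = ET.fromstring(blocks[0].strip())
title = root.attrib["{%s}title" % DC_NS]
print(title)
```

The HTML parser hands the script contents over untouched, so the entity references in the attribute values are left for the XML parser to resolve.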

To allow the page with the embedded RDF to validate, you can surround the script contents with HTML comments:

<!--
-->

Unfortunately, HTML comments are also XML comments, and any content within them tends to be ignored by most XML parsers, including RDF/XML parsers.
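The effect is easy to see with a small test. The sketch below assumes Python's standard ElementTree as a representative XML parser, and the sample page is hypothetical. With its default settings, ElementTree discards comments, so the commented-out RDF/XML never appears in the parsed tree.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical XHTML page with the RDF/XML wrapped in a comment.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body>
    <!--
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        dc:title="Good RSS" />
    -->
    <p>Visible content</p>
  </body>
</html>"""

root = ET.fromstring(page)

# Search the whole tree for an rdf:RDF element.
rdf_elements = root.findall(".//{%s}RDF" % RDF_NS)
print(len(rdf_elements))
```

The search finds no rdf:RDF element, because the parser dropped the comment along with everything inside it.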

Until XML can be embedded into an XHTML document in such a way that allows the page to be validated, the only approach you can take for the RDF data is to include it in an external RDF document and then link the document into the XHTML page using the link element:

<link rel="meta" type="application/rdf+xml" title="RSS" 
href="http://burningbird.net/index.rdf" />
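A tool that wants the RDF for a page can look for this link element and resolve its href. The following is a rough sketch using Python's standard library. The RDFLinkFinder class and the sample page are hypothetical, and matching on the type attribute alone is an assumption made for this sketch. Fetching the linked document is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RDFLinkFinder(HTMLParser):
    """Collect the URLs of link elements typed as application/rdf+xml."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rdf_urls = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        attributes = dict(attrs)
        if (tag == "link"
                and attributes.get("type") == "application/rdf+xml"
                and attributes.get("href")):
            # Resolve relative hrefs against the page's own URL.
            self.rdf_urls.append(urljoin(self.base_url, attributes["href"]))


# Hypothetical page using the link element shown above.
page = """<html><head>
<link rel="meta" type="application/rdf+xml" title="RSS"
href="http://burningbird.net/index.rdf" />
<link rel="stylesheet" type="text/css" href="style.css" />
</head><body></body></html>"""

finder = RDFLinkFinder("http://burningbird.net/")
finder.feed(page)
finder.close()
rdf_urls = finder.rdf_urls
print(rdf_urls)
```

Each URL found this way could then be retrieved and handed to an RDF/XML parser as an ordinary, separate document.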

Another approach is to embed the RDF/XML into the XHTML using comments but to pull this data out and feed it directly to an RDF/XML parser. It's a bit cumbersome, but doable, especially since most screen-scraping technologies such as Perl's LWP provide for finding specific blocks of data and grabbing them directly.
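As an illustration of this last approach, here is a small sketch in Python rather than Perl. It treats the page as plain text, grabs each comment with a regular expression, and keeps the ones that hold RDF/XML. The extract_commented_rdf helper and the sample page are hypothetical. The sketch assumes the block starts with the rdf: prefix, and it again uses ElementTree only as a stand-in for a real RDF/XML parser.

```python
import re
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Matches any comment; DOTALL lets the body span several lines.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def extract_commented_rdf(page):
    """Return the body of every comment that starts with an rdf:RDF element."""
    blocks = []
    for match in COMMENT.finditer(page):
        body = match.group(1).strip()
        if body.startswith("<rdf:RDF"):
            blocks.append(body)
    return blocks


# Hypothetical page: one ordinary comment, one holding RDF/XML.
page = """<html><head><title>Weblog</title></head>
<body>
<!-- navigation starts here -->
<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    dc:title="Good RSS"
    dc:date="2002-10-22T09:46:26-06:00" />
-->
<p>Post body</p>
</body></html>"""

blocks = extract_commented_rdf(page)
root = ET.fromstring(blocks[0])
title = root.attrib["{%s}title" % DC_NS]
print(title)
```

Because the extraction works on the raw text, the page itself never has to pass through an XML parser that would throw the comments away.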