13.5 The RSS Modules

RSS consists of three basic or core modules as well as the module extension mechanism just described. For the most part, the three core modules will fit most data needs. However, specialized business data and/or processing may require new elements.

13.5.1 Core: Syndication, Content, and Dublin Core

The Syndication module provides information, such as the update frequency of the data, for tool builders. Rather than an RSS aggregator having to test a data source at set time periods, it can access the Syndication data and update only when the data is scheduled to change.

Table 13-1 lists the Syndication elements. The namespace declaration for Syndication is xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"; the prefix itself is arbitrary, and Example 13-4 uses syn rather than sy. The elements are subelements of the channel element, which means they apply to the RSS document as a whole rather than to individual items.

Table 13-1. Syndication elements

Element          Purpose                                                            Data
updatePeriod     Frequency of update of data                                        hourly | daily | weekly | monthly | yearly
updateFrequency  Frequency of updates within time period                            Integer
updateBase       Base date combined with period and frequency to determine updates  PCDATA
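How an aggregator might turn these three values into a polling schedule can be sketched in Python. This is an illustration only, not something defined by the Syndication module: the function names are hypothetical, monthly and yearly are approximated as fixed 30-day and 365-day spans, and updateFrequency is assumed to divide the period into equal intervals counted from updateBase.

```python
from datetime import datetime, timedelta, timezone

# Approximate period lengths. The fixed lengths for "monthly" and "yearly"
# are simplifying assumptions made for this sketch.
PERIODS = {
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "monthly": timedelta(days=30),
    "yearly": timedelta(days=365),
}


def update_interval(period, frequency=1):
    """Time between updates: one period divided by the updates per period."""
    if period not in PERIODS:
        raise ValueError(f"unknown updatePeriod: {period!r}")
    if frequency < 1:
        raise ValueError("updateFrequency must be a positive integer")
    return PERIODS[period] / frequency


def next_update(period, frequency, base, now):
    """First scheduled update strictly after `now`, counting from `base`."""
    interval = update_interval(period, frequency)
    # Whole intervals elapsed since the base date (floor division).
    elapsed_steps = (now - base) // interval
    return base + (elapsed_steps + 1) * interval


if __name__ == "__main__":
    base = datetime(1901, 1, 1, tzinfo=timezone.utc)
    now = datetime(2003, 1, 25, 16, 22, 2, tzinfo=timezone.utc)
    print(next_update("hourly", 1, base, now).isoformat())
```

With a period of hourly, a frequency of 1, and a base date of 1901-01-01T00:00+00:00, a check made at 16:22:02 UTC yields 17:00:00 UTC as the next scheduled update, so the aggregator can skip polling until then.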

Example 13-4 shows a simplified RDF/RSS file demonstrating the use of the Syndication elements. This file is actually generated from a merge of several RDF/RSS files using an application built in Perl that I'll demonstrate later in the chapter. I've simplified the file to show only one item to restrict the size of the example. The Syndication elements are the three syn-prefixed elements within the channel element.

Example 13-4. RSS demonstrating use of Syndication Elements
<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/"
>

<channel rdf:about="http://burningbird.net">
	<title>Burningbird Network</title>
	<link>http://burningbird.net</link>
	<description>Burningbird: Burning online since 1995</description>
	<dc:language>en-us</dc:language>
	<dc:rights>Copyright 1995-2003, Shelley Powers, Burningbird</dc:rights>
	<dc:publisher>shelleyp@burningbird.net</dc:publisher>
	<dc:creator>shelleyp@burningbird.net</dc:creator>
	<dc:subject>writing,technology,art,photography,science,environment,politics</dc:subject>
	<syn:updatePeriod>hourly</syn:updatePeriod>
	<syn:updateFrequency>1</syn:updateFrequency>
	<syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
	<items>
		<rdf:Seq>
			<rdf:li rdf:resource="http://rdf.burningbird.net/archives/000853.htm" />
		</rdf:Seq>
	</items>
	<image rdf:resource="http://burningbird.net/mm/birdflame.gif" />
</channel>
<image rdf:about="http://burningbird.net/mm/birdflame.gif">
	<title>Burningbird</title>
	<url>http://burningbird.net/mm/birdflame.gif</url>
	<link>http://burningbird.net/</link>
	<dc:creator>Shelley Powers</dc:creator>
</image>
<item rdf:about="http://rdf.burningbird.net/archives/000853.htm">
	<title>Corrected chapters 6 and 9 uploaded</title>
	<link>http://rdf.burningbird.net/archives/000853.htm</link>
	<description>I found some small errors in the schema from chapters 6 and 9 and have 
uploaded corrected chapters for both.
	</description>
	<dc:creator>yasd</dc:creator>
	<dc:date>2003-01-25T10:22:02-06:00</dc:date>
</item>
</rdf:RDF>
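As a rough sketch of how a tool might read the Syndication values out of a file like Example 13-4, the following Python uses the standard library's xml.etree.ElementTree. The FEED string is a trimmed copy of the example, the function name read_syndication is hypothetical, and the fallback values of daily and 1 for missing elements are assumptions of this sketch that should be checked against the module specification.

```python
import xml.etree.ElementTree as ET

# Prefixes here are local to this script; only the namespace URIs matter.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "syn": "http://purl.org/rss/1.0/modules/syndication/",
}

# Trimmed copy of Example 13-4: just the channel and its Syndication elements.
FEED = """<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:syn="http://purl.org/rss/1.0/modules/syndication/">
<channel rdf:about="http://burningbird.net">
  <title>Burningbird Network</title>
  <link>http://burningbird.net</link>
  <syn:updatePeriod>hourly</syn:updatePeriod>
  <syn:updateFrequency>1</syn:updateFrequency>
  <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
</channel>
</rdf:RDF>"""


def read_syndication(xml_text):
    """Return the channel-level Syndication values as a dictionary."""
    root = ET.fromstring(xml_text)
    channel = root.find("rss:channel", NS)
    if channel is None:
        raise ValueError("no RSS 1.0 channel element found")

    def text_of(name, default):
        node = channel.find(f"syn:{name}", NS)
        if node is not None and node.text:
            return node.text.strip()
        return default  # assumed fallback when the element is absent

    return {
        "updatePeriod": text_of("updatePeriod", "daily"),
        "updateFrequency": int(text_of("updateFrequency", "1")),
        "updateBase": text_of("updateBase", None),
    }


if __name__ == "__main__":
    print(read_syndication(FEED))
```

Because the lookup is by namespace URI rather than by prefix, the same code works whether a feed declares the module as syn, sy, or anything else.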

The Content module provides information about the format of the data contained in the RSS document. This includes space for comments about the data. The namespace declaration is xmlns:content="http://purl.org/rss/1.0/modules/content/".

Table 13-2 lists the Content elements. Note that at the time of this writing, the content:encoded element hadn't yet been approved by the RSS Working Group.

Table 13-2. Content module elements

Element    Purpose                                        Data
items      Container for item                             Subelement of RSS item or channel
item       Provides a description of containing element   PCDATA
format     Format of item                                 Empty element with rdf:resource pointing to URI of format
rdf:value  Used if URI is not provided with content:item  CDATA
encoding   Encoding of item                               Optional empty element with rdf:resource pointing to URI of encoding format

The content:encoded element wraps the content of an individual RSS feed item as a CDATA-encoded value. Rather than an excerpt, content:encoded tends to carry the entire article or item. Of course, one of the problems with something such as this is republication rights: if the complete article is provided in the RSS feed, can it be republished in an aggregation that is publicly accessible?

The issue of republication and rights is covered when we look at Creative Commons licensing and its use of RDF/XML in Chapter 14.
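To make the content:encoded behavior described above more concrete, here is a minimal Python sketch of how an aggregator might prefer the full CDATA-wrapped body when it is present and fall back to the description excerpt otherwise. The feed text, the example.org URLs, and the function name item_bodies are all made up for illustration; nothing here is prescribed by the Content module.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}

# Hypothetical two-item feed: one item carries the full text, one does not.
FEED = """<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:content="http://purl.org/rss/1.0/modules/content/">
<item rdf:about="http://example.org/archives/1.htm">
  <title>Full text item</title>
  <description>Short excerpt...</description>
  <content:encoded><![CDATA[<p>The <em>entire</em> article.</p>]]></content:encoded>
</item>
<item rdf:about="http://example.org/archives/2.htm">
  <title>Excerpt only</title>
  <description>Only an excerpt is available.</description>
</item>
</rdf:RDF>"""


def item_bodies(xml_text):
    """Map each item's rdf:about URI to its fullest available body text."""
    root = ET.fromstring(xml_text)
    about = "{" + NS["rdf"] + "}about"
    bodies = {}
    for item in root.findall("rss:item", NS):
        encoded = item.find("content:encoded", NS)
        if encoded is not None and encoded.text:
            # The parser hands back the CDATA section as plain text.
            body = encoded.text
        else:
            body = item.findtext("rss:description", default="", namespaces=NS)
        bodies[item.get(about)] = body.strip()
    return bodies


if __name__ == "__main__":
    for uri, body in item_bodies(FEED).items():
        print(uri, "->", body)
```

The markup inside the CDATA section arrives as an ordinary string, so whether it may be redisplayed is a rights question rather than a parsing one.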

The Dublin Core RSS module is an RSS-standardized version of the Dublin Core RDF (discussed in Chapter 6). The RSS namespace for this module is xmlns:dc="http://purl.org/dc/elements/1.1/". Though made available as part of the RSS specification, the elements as described in Chapter 6 are no different here, so I won't repeat the list. An example of their use with RSS can be seen in the RSS associated with my weblog feed:

<item rdf:about="http://weblog.burningbird.net/archives/000479.php">
<title>Today is for Working</title>
<description>
The best part of getting up early is watching the sun rise. 
Today is for working, it is. Today is for nose down and 
finishing tasks and making milestones. I've marked out in my mind tasks 
to accomplish with each...
</description>
<link>http://weblog.burningbird.net/archives/000479.php</link>
<dc:subject>Life in General</dc:subject>
<dc:creator>shelley</dc:creator>
<dc:date>2002-08-26T05:13:03-06:00</dc:date>
</item>
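A consumer of this item could pick out the Dublin Core elements by namespace, as in the following Python sketch. The item is abbreviated and wrapped in an rdf:RDF element so that it parses on its own, and the helper name dublin_core is hypothetical. The date handling assumes a complete timestamp with a time zone offset, as in the item above; shorter date forms would need extra handling.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The weblog item, abbreviated, with the namespace declarations it needs.
FEED = """<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://purl.org/rss/1.0/"
 xmlns:dc="http://purl.org/dc/elements/1.1/">
<item rdf:about="http://weblog.burningbird.net/archives/000479.php">
<title>Today is for Working</title>
<link>http://weblog.burningbird.net/archives/000479.php</link>
<dc:subject>Life in General</dc:subject>
<dc:creator>shelley</dc:creator>
<dc:date>2002-08-26T05:13:03-06:00</dc:date>
</item>
</rdf:RDF>"""


def dublin_core(item):
    """Collect an item's dc: child elements, keyed by local element name."""
    prefix = "{" + NS["dc"] + "}"
    return {
        child.tag[len(prefix):]: (child.text or "").strip()
        for child in item
        if child.tag.startswith(prefix)
    }


root = ET.fromstring(FEED)
meta = dublin_core(root.find("rss:item", NS))

# Parse the timestamp and normalize it to UTC so that items from feeds in
# different time zones can be sorted together.
published_utc = datetime.fromisoformat(meta["date"]).astimezone(timezone.utc)

if __name__ == "__main__":
    print(meta)
    print(published_utc.isoformat())
```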

The Dublin Core elements are also used in my Favorite Books application, described in Section 13.8.

Currently the Dublin Core elements usually contain CDATA values (i.e., string literals), but the data type definition for the elements is all PCDATA. The Working Group is looking at the possibility of merging the use of Dublin Core RSS elements with that of the Taxonomy module (discussed in the next section).

The Dublin Core elements are fully defined in the document, "Dublin Core Element Set, Version 1.1" located at http://dublincore.org/documents/1999/07/02/dces/.

13.5.2 Extended Modules

Several extended RSS modules describe information that ranges from discussion threads to companies to email and taxonomies. At the time of this writing, the only modules that have been accepted as standard are those just described: the core modules Syndication, Content, and Dublin Core.

Approved and pending modules can be found at http://web.resource.org/rss/1.0/modules/ and as part of the dmoz Open Directory project, http://dmoz.org/Reference/Libraries/Library_and_Information_Science/Technical_Services/Cataloguing/Metadata/RDF/Applications/RSS/Specifications/RSS1.0_Modules/.

I won't list each individual module, as none has been accepted fully into the RSS 1.0 specification and the list is subject to change. However, before proceeding to look at the tools that work with RSS 1.0, I want to digress for a moment and talk about the concepts behind extended modules.

The idea of using modules to extend the RSS 1.0 specification, without having to modify or edit the specification directly, is a good one; new modules, such as mod_slash for slashdot.com-related data, mod_dcterms for Qualified Dublin Core metadata, and mod_link for linking, are based on this. By using namespaces to differentiate between the modules, one can easily add a new module without affecting the others. However, this very simplicity is a danger in and of itself.

The whole purpose behind namespace support for RDF was to allow the combination of multiple RDF vocabularies without collision between vocabulary elements. The assumption behind this was that each vocabulary was both comprehensive and complete in its description of the business data the vocabulary defines.
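The collision avoidance that namespaces provide shows up in the modules already covered: both RSS itself and the Content module define an element named items. The short Python sketch below, whose function name and skeletal channel document are made up for illustration, shows that a namespace-aware parser sees these as two distinct names.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
CONTENT = "http://purl.org/rss/1.0/modules/content/"

# Skeletal channel holding an "items" element from each vocabulary.
DOC = f"""<channel xmlns="{RSS}" xmlns:content="{CONTENT}">
  <items/>
  <content:items/>
</channel>"""


def expanded_names(xml_text):
    """Return (namespace URI, local name) for each child of the root."""
    names = []
    for child in ET.fromstring(xml_text):
        # ElementTree stores qualified tags in the form "{namespace}local".
        namespace, local = child.tag[1:].split("}", 1)
        names.append((namespace, local))
    return names


if __name__ == "__main__":
    for namespace, local in expanded_names(DOC):
        print(local, "from", namespace)
```

The two elements share a local name but differ in their namespace URI, so a processor that understands only the core RSS vocabulary can ignore content:items without misreading it.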

However, there is a subtle difference between this original purpose for namespaces in RDF (and in XML) and their use in RSS 1.0. Rather than namespaces being seen as a way of combining rich and complete vocabularies, they're seen as a way of adding new data easily and quickly, and that's not always a good thing.

A case in point: during the discussions related to RDF/RSS (RSS 1.0) and non-RDF RSS (Userland RSS or RSS 2.0), one issue that came up was how to redirect RSS aggregators to new locations for feeds when the feeds were moved. Suggestions were made to use mod_dcterms for this purpose, but this wasn't necessarily compatible with the non-RDF RSS. Another suggestion was to create a module that contained one element for RDF RSS and one element for non-RDF RSS. This was a clear violation of both the concepts and the philosophy for namespaces within RDF (or within the larger world of XML).

The idea wasn't followed through on but does demonstrate the danger behind viewing modules as workarounds rather than complete and independent RDF vocabularies that can exist outside of the RSS 1.0 specification.