9.3 RDF and Python: RDFLib

It would be difficult not to see the natural fit between Python and RDF. Of course, Python programmers would say the same happens with all uses of Python, but when you see how quick and simple it is to build an RDF/XML model from scratch using the Python RDF library, RDFLib, you might think about switching regardless of what language you normally use.

RDFLib was created by Daniel Krech. Download the most recent release of RDFLib at http://rdflib.net. I used RDFLib 1.2.3 on Windows 2000 when writing this section. RDFLib requires Python 2.2.1 or later. Additional software is required if you want to use the rdflib.Z InformationStore, which provides support for contexts in addition to persistent triples.

RDFLib is actually part of a larger application framework, Redfoot, discussed in Chapter 12. However, RDFLib is a separate, fully functional RDF API. If the API lacks anything, it's documentation, which is quite scarce for the product. Still, the libraries are so intuitive that one could almost say the documentation isn't needed.

All the unique components of an RDF model have been defined as Python objects in RDFLib:

rdflib.URIRef: A resource with a URI
rdflib.BNode: A resource without a URI (a blank node)
rdflib.Literal: A literal
rdflib.Namespace: Manages a namespace
rdflib.TripleStore: An in-memory triple store

In addition, rdflib.constants contains definitions for the RDF properties, such as type and value.
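A namespace object of this kind is easiest to picture as a small helper that builds full URIs from local names. The following is a minimal plain-Python sketch of that idea, not RDFLib's own implementation; the class name SketchNamespace and its base_uri attribute are hypothetical and used only for illustration.

```python
class SketchNamespace(object):
    """Hypothetical stand-in for a namespace helper; not RDFLib's class."""

    def __init__(self, base_uri):
        self.base_uri = base_uri

    def __getitem__(self, local_name):
        # Indexing with a local name returns the full URI as a string.
        return self.base_uri + local_name


POSTCON = SketchNamespace("http://burningbird.net/postcon/elements/1.0/")

# Each lookup appends the local name to the namespace URI.
document_type = POSTCON["Document"]
requires_predicate = POSTCON["requires"]

print(document_type)
print(requires_predicate)
```

Defining the namespace once and indexing into it keeps the long URI in a single place, so a typo in a predicate name is easier to spot than one buried in a repeated URI string.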

Example 9-7 implements a subgraph of the test RDF/XML document (monsters1.rdf) defined in the following snippet of XML:

<pstcn:Resource rdf:about="monsters1.htm">
   <pstcn:presentation rdf:parseType="Resource">
      <pstcn:requires rdf:parseType="Resource">
        <pstcn:type>stylesheet</pstcn:type>
        <rdf:value>http://burningbird.net/de.css</rdf:value>
      </pstcn:requires>
   </pstcn:presentation>
</pstcn:Resource>

To begin, a Namespace object is created for the PostCon namespace, along with a TripleStore to hold the model in progress. Following this, the top-level resource is created using URIRef and added to the store in a triple that pairs the RDF type property with the PostCon Document type. After that, it's just a matter of creating the appropriate type of object for each node and adding more triples. Note that Namespace supplies the namespace URI for every object that requires one, such as all of the PostCon predicates. At the end, the triples are printed to standard output, and the model is serialized to RDF/XML.

Example 9-7. Building a graph using RDFLib

from rdflib.URIRef import URIRef
from rdflib.Literal import Literal
from rdflib.BNode import BNode
from rdflib.Namespace import Namespace
from rdflib.constants import TYPE, VALUE

# Import RDFLib's default TripleStore implementation
from rdflib.TripleStore import TripleStore

# Create a namespace object
POSTCON = Namespace("http://burningbird.net/postcon/elements/1.0/")

store = TripleStore(  )

store.prefix_mapping("pstcn", "http://http://burningbird.net/postcon/elements/1.0/")
 
# Create top-level resource
monsters = URIRef(POSTCON["monsters1.htm"])

# Add type statement
store.add((monsters, TYPE, POSTCON["Document"]))

# Create bnode and add as statement
presentation = BNode(  )
store.add((monsters, POSTCON["presentation"], presentation))

# Create second bnode, add
requires = BNode(  )
store.add((presentation, POSTCON["requires"], requires))

# Add the two literal end nodes
# (named to avoid shadowing Python's built-in type)
type_literal = Literal("stylesheet")
store.add((requires, POSTCON["type"], type_literal))

value_literal = Literal("http://burningbird.net/de.css")
store.add((requires, VALUE, value_literal))

# Iterate over triples in store and print them out
for s, p, o in store:
    print s, p, o

# Serialize the store as RDF/XML to the file subgraph.rdf
store.save("subgraph.rdf")

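Just this small sample demonstrates how simple RDFLib is to use. The generated RDF/XML looks similar to the following, indentation and all, which is a nice little feature of the library. Note that the URI passed to prefix_mapping in Example 9-7 contains a doubled http://, so it doesn't match the PostCon namespace used for the resources and predicates. As a result, the pstcn prefix is declared but unused, and the PostCon elements appear under a generated prefix, n4.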

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
   xmlns:n4="http://burningbird.net/postcon/elements/1.0/"
   xmlns:pstcn="http://http://burningbird.net/postcon/elements/1.0/"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
  <n4:Document rdf:about="http://burningbird.net/postcon/elements/1.0/monsters1.htm">
    <n4:presentation>
      <rdf:Description>
        <n4:requires>
          <rdf:Description>
            <n4:type>stylesheet</n4:type>
            <rdf:value>http://burningbird.net/de.css</rdf:value>
          </rdf:Description>
        </n4:requires>
      </rdf:Description>
    </n4:presentation>
  </n4:Document>
</rdf:RDF>

Testing this in the RDF Validator results in a directed graph equivalent to the subgraph found in the larger model, and equivalent to the graph generated earlier in the chapter with the Perl modules.
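To make the shape of that graph concrete, the following sketch models the same five statements as plain Python tuples and walks from the top-level resource through the two blank nodes to the stylesheet URL. It doesn't use RDFLib at all; the blank node labels (_:b1, _:b2) and the follow helper are hypothetical names chosen for illustration.

```python
PSTCN = "http://burningbird.net/postcon/elements/1.0/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

monsters = PSTCN + "monsters1.htm"
# Blank nodes have no URI; these labels are arbitrary placeholders.
presentation = "_:b1"
requires = "_:b2"

# The subgraph as (subject, predicate, object) tuples.
triples = [
    (monsters, RDF + "type", PSTCN + "Document"),
    (monsters, PSTCN + "presentation", presentation),
    (presentation, PSTCN + "requires", requires),
    (requires, PSTCN + "type", "stylesheet"),
    (requires, RDF + "value", "http://burningbird.net/de.css"),
]


def follow(subject, predicate):
    """Return the first object found for (subject, predicate), or None."""
    for s, p, o in triples:
        if s == subject and p == predicate:
            return o
    return None


# Walk: resource -> presentation bnode -> requires bnode -> rdf:value
node = follow(monsters, PSTCN + "presentation")
node = follow(node, PSTCN + "requires")
stylesheet_url = follow(node, RDF + "value")

print(stylesheet_url)
```

The two blank nodes exist only to connect the top-level resource to the literal values, which is why they appear as anonymous rdf:Description elements in the serialized RDF/XML.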

You can also load an existing RDF/XML document into a TripleStore and then run queries against the triples. Example 9-8 contains a small Python application that loads monsters1.rdf into a TripleStore and then looks for all subjects of class Movement. These are passed into an inner loop and used to look up the movement type for each Movement.

Example 9-8. Finding all movements and movement types in RDF/XML document

from rdflib.Namespace import Namespace
from rdflib.constants import TYPE

# Import RDFLib's default TripleStore implementation
from rdflib.TripleStore import TripleStore

# Create a namespace object 
POSTCON = Namespace("http://burningbird.net/postcon/elements/1.0/")
DC = Namespace("http://purl.org/dc/elements/1.1/")

store = TripleStore(  )
store.load("http://burningbird.net/articles/monsters1.rdf")

# For each pstcn:Movement print out movementType
for movement in store.subjects(TYPE, POSTCON["Movement"]):
    for movementType in store.objects(movement, POSTCON["movementType"]):
        print "Moved To: %s Reason: %s" % (movement, movementType)

This application prints out the movement resource objects as well as the movement types:

Moved To: http://burningbird.net/burningbird.net/articles/monsters1.htm Reason: Move
Moved To: http://www.yasd.com/dynaearth/monsters1.htm Reason: Add
Moved To: http://www.dynamicearth.com/articles/monsters1.htm Reason: Move

The TripleStore document triple_store.html in the RDFLib documentation describes the TripleStore.triples method and the variations on it that you can use for queries. The methods differ, but the basic functionality remains the same as that just demonstrated.
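The relationship between a general triples query and its variations can be sketched in plain Python. The following is an illustration under stated assumptions, not RDFLib's implementation: the SketchStore class is hypothetical, and it assumes that a pattern is a (subject, predicate, object) tuple in which None acts as a wildcard. The sample data reuses two of the movement URIs from the output shown earlier.

```python
class SketchStore(object):
    """Hypothetical in-memory store; illustrates pattern matching only."""

    def __init__(self):
        self._triples = []

    def add(self, triple):
        self._triples.append(triple)

    def triples(self, pattern):
        """Yield each stored triple matching pattern; None matches anything."""
        for triple in self._triples:
            if all(want is None or want == have
                   for want, have in zip(pattern, triple)):
                yield triple

    def subjects(self, predicate, obj):
        # A variation on triples: fix predicate and object, vary the subject.
        for s, p, o in self.triples((None, predicate, obj)):
            yield s

    def objects(self, subject, predicate):
        # A variation on triples: fix subject and predicate, vary the object.
        for s, p, o in self.triples((subject, predicate, None)):
            yield o


PSTCN = "http://burningbird.net/postcon/elements/1.0/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

store = SketchStore()
sample = [
    ("http://www.yasd.com/dynaearth/monsters1.htm", "Add"),
    ("http://www.dynamicearth.com/articles/monsters1.htm", "Move"),
]
for uri, reason in sample:
    store.add((uri, RDF_TYPE, PSTCN + "Movement"))
    store.add((uri, PSTCN + "movementType", reason))

# Same nested-loop query as Example 9-8, run against the sketch store.
results = []
for movement in store.subjects(RDF_TYPE, PSTCN + "Movement"):
    for movement_type in store.objects(movement, PSTCN + "movementType"):
        results.append((movement, movement_type))
        print("Moved To: %s Reason: %s" % (movement, movement_type))
```

In this sketch, subjects and objects are thin wrappers around triples, each fixing two positions of the pattern and returning the third. A real store would normally index its triples rather than scan a list, but the query semantics are the same.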

Another open source and Python-based RDF API is 4RDF and its Versa query language, a product of Fourthought. 4RDF is part of 4Suite, a set of tools for working with XML, XSLT, and RDF. More information is available at http://fourthought.com.