9.2 RDF API for PHP

Few languages have achieved faster acceptance than PHP. ISPs now install support for PHP when they install Apache, so most people have access to this server-side tag-based scripting language. And where there's scripting, there's support for RDF. PHP boasts two RDF APIs: the RDF API for PHP and the RDF-specific classes within the PHP XML classes. The latter is covered in the next chapter; this chapter focuses on the RDF API for PHP, which I'll refer to as RAP for brevity.

The RDF API for PHP (RAP) home page is at http://www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi/. The SourceForge project for the API is at http://sourceforge.net/projects/rdfapi-php/.

9.2.1 Basic Building Blocks

The RAP classes are split into three main packages: model, syntax, and util. The model package includes all the classes to create or read specific elements of an RDF model, including reading or creating complete statements from a model or their individual components. These classes are:

BlankNode

Used to create a blank node, to get the bnode identifier, or check equality between two bnodes

Literal

Support for model literals

Model

Contains methods to build or read a specific RDF model

Node

An abstract RDF node

Resource

Support for model resources

Statement

Creating or manipulating a complete RDF triple

RAP doesn't, at this time, support persistence to a database such as MySQL or Berkeley DB, but you can serialize the data through RdfSerializer, which is one of the two syntax classes. To read a serialized model, you would then use the other syntax class, RdfParser.

The util class Object is another abstract class with some general methods overridden in the classes built on it, so it's of no interest for our purposes. However, the RDFUtil class provides some handy methods, including writeHTMLTable, which outputs a model in nice tabular form.

9.2.2 Building an RDF Model

Creating a new RDF model and adding statements to it using RAP is extremely easy. Start by creating a new RDF graph (data model) and then just add statements to it, creating new resources or literals as you go. The best way to see how to create a new graph is to look at a complete example of creating a model and then outputting the results to a page.

In the first example of this API, the path from the top-level resource all the way through the first movement is created as a subgraph of the larger monsters1.rdf model. Since movements in this model are coordinated through an RDF container, rdf:Seq, information related to the container must also be added to ensure that the generated RDF/XML maps correctly to the original RDF/XML of the full model. The N-Triples for just this path, as generated by the RDF Validator, are:

<http://burningbird.net/articles/monsters1.htm> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://burningbird.net/postcon/elements/1.0/Document> .
<http://burningbird.net/articles/monsters1.htm> <http://burningbird.net/postcon/elements/1.0/history> _:jARP31427 .
_:jARP31427 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq> .
_:jARP31427 <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> <http://www.yasd.com/dynaearth/monsters1.htm> .
<http://www.yasd.com/dynaearth/monsters1.htm> <http://burningbird.net/postcon/elements/1.0/movementType> "Add" .

In the script, the first two lines map the RDF API directory and should reflect your own installation. This test script was built on a Linux box, which the path to the API reflects. Following the directory definition, a new model is created, along with the top-level resource (since this will be used more than once in the page). Added to the new model is a new statement consisting of the top-level resource as the subject, the $RDF_type resource supplied by the API as the predicate, and a new resource as the object. In this case, the top-level resource is typed as a PostCon Document.

Following the initial statement, a blank node is created to represent the rdf:Seq object using the label history, and a type statement identifying it as rdf:Seq is added to the model. The first of the movements is added using the container membership property rdf:_1 as the predicate and the URI of the movement resource as the object. In the last statement, the movementType property is added for this resource, as shown in Example 9-5. To observe the resulting model, it's first written to the page as a table of statements using the RDFUtil::writeHTMLTable method. The model is then serialized to RDF/XML using the RDFSerializer class.

Example 9-5. Creating an RDF model using RDF API for PHP and serializing it to the page
<?php
define("RDFAPI_INCLUDE_DIR", "./../api/");
include(RDFAPI_INCLUDE_DIR . "RDFAPI.php");

// New Model, set base URI
$model = new Model(  );
$model->setBaseURI("http://burningbird.net/articles/");

// first statement
$mainsource = new Resource("monsters1.htm");
$model->add(new Statement($mainsource, $RDF_type,
            new Resource("http://burningbird.net/postcon/elements/1.0/Document")));

$history = new BlankNode("history");
$model->add(new Statement($mainsource,
            new Resource("http://burningbird.net/postcon/elements/1.0/history"),
            $history));

// Define the container as an rdf:Seq
$model->add(new Statement($history, $RDF_type, $RDF_Seq));

$movement = new Resource("http://www.yasd.com/dynaearth/monsters1.htm");
$model->add(new Statement($history,
            new Resource(RDF_NAMESPACE_URI . "_1"),
            $movement));

$model->add(new Statement($movement,
            new Resource("http://burningbird.net/postcon/elements/1.0/movementType"),
            new Literal("Add", "en")));

// Output as table
RDFUtil::writeHTMLTable($model);

// Serialize and output model
$ser = new RDFSerializer(  );
$ser->addNamespacePrefix("pstcn",
"http://burningbird.net/postcon/elements/1.0/");
$rdf =& $ser->serialize($model);
echo "<p><textarea cols='110' rows='20'>" . $rdf . "</textarea>";

// Save the model to a file
$ser->saveAs($model,"rdf_output.rdf");

?>

When this script is included within HTML and accessed via the Web, the result looks similar to Figure 9-1.

Figure 9-1. Page resulting from running PHP script in Example 9-5

Example 9-5 saves the model through the serializer's saveAs method. If you'd rather persist the serialized string yourself, you can also use PHP's own file I/O functions to save the generated RDF/XML to a file. Note that the figure shows bnodes as URIs, which isn't the proper format. However, this is an internally generated value that has no impact on the validity of the RDF/XML.
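As a minimal sketch of the file I/O approach, the following writes a string of RDF/XML to disk with PHP's fopen and fwrite. The $rdf value is a placeholder standing in for the string returned by the serializer in Example 9-5, and the file name rdf_copy.rdf is a hypothetical name chosen for illustration.

```php
<?php
// Placeholder RDF/XML. In Example 9-5 this string would come from
// $ser->serialize($model); the content here is illustrative only.
$rdf = '<?xml version="1.0"?>' . "\n" .
       '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">' . "\n" .
       '</rdf:RDF>' . "\n";

// Hypothetical output file name
$filename = "rdf_copy.rdf";

// Open for writing, truncating any existing file
$fp = fopen($filename, "w");
if ($fp === false) {
    die("Could not open $filename for writing");
}

// Write the serialized model and close the handle
$written = fwrite($fp, $rdf);
fclose($fp);

echo "Wrote $written bytes to $filename";
?>
```

The web server's user needs write permission on the target directory for this to succeed.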

Example 9-6 contains the script to open this serialized RDF/XML and iterate through it (this script was provided by the RAP creator, Chris Bizer).

Example 9-6. Iterating through the serialized RDF/XML created in Example 9-5
<?php

// Include RDF API
define("RDFAPI_INCLUDE_DIR", "./../api/");
include(RDFAPI_INCLUDE_DIR . "RDFAPI.php");

// Create new Parser
$parser = new RdfParser(  );

// Parse document
$model =& $parser->generateModel("rdf_output.rdf");

// Get StatementIterator
$it = $model->getStatementIterator(  );

// Traverse model and output statements
while ($it->hasNext(  )) {
   $statement = $it->next(  );
   echo "Statement number: " . $it->getCurrentPosition(  ) . "<BR>";
   echo "Subject: " . $statement->getLabelSubject(  ) . "<BR>";
   echo "Predicate: " . $statement->getLabelPredicate(  ) . "<BR>";
   echo "Object: " . $statement->getLabelObject(  ) . "<P>";
}

?>

You can add or subtract statements on a given model, check to see if the model contains a specific statement, and even find the intersection or combination of multiple models, using the Model class. However, one of the most frequent activities you'll likely do is query the model.
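The model operations just mentioned might look something like the following sketch. It assumes the Model class exposes methods named contains, unite, intersect, and subtract, which is how I understand the RAP documentation; check the names against your installed version before relying on them. The two small models are built only for illustration.

```php
<?php
// Include RDF API (path is installation-specific, as in Example 9-5)
define("RDFAPI_INCLUDE_DIR", "./../api/");
include(RDFAPI_INCLUDE_DIR . "RDFAPI.php");

$pstcn = "http://burningbird.net/postcon/elements/1.0/";

// Two statements taken from the monsters1 subgraph
$typeStmt = new Statement(
    new Resource("http://burningbird.net/articles/monsters1.htm"),
    $RDF_type,
    new Resource($pstcn . "Document"));
$moveStmt = new Statement(
    new Resource("http://www.yasd.com/dynaearth/monsters1.htm"),
    new Resource($pstcn . "movementType"),
    new Literal("Add"));

// Model A holds both statements, model B holds only the first
$modelA = new Model();
$modelA->add($typeStmt);
$modelA->add($moveStmt);

$modelB = new Model();
$modelB->add($typeStmt);

// Membership test for a single statement
if ($modelB->contains($moveStmt)) {
    echo "Model B has the movementType statement<BR>";
} else {
    echo "Model B lacks the movementType statement<BR>";
}

// Assumed method names: each call returns a model holding the result
$united  = $modelA->unite($modelB);      // combination of A and B
$common  = $modelA->intersect($modelB);  // statements found in both
$onlyInA = $modelA->subtract($modelB);   // statements in A but not in B

// Show what is unique to model A
RDFUtil::writeHTMLTable($onlyInA);
?>
```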

Querying a Model

The Model class in RAP has a couple of different methods you can use to find information. For instance, the findVocabulary method returns all triples from a given vocabulary, as identified by a namespace. This is rather handy if your document combines elements from many different namespaces.
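A short sketch of findVocabulary follows, run against the rdf_output.rdf file saved by Example 9-5. It assumes the method takes the namespace URI as a single string argument and returns a new model, and that a triple counts as belonging to the vocabulary when its predicate falls in that namespace; both points reflect my reading of the API rather than anything guaranteed here.

```php
<?php
// Include RDF API (path is installation-specific)
define("RDFAPI_INCLUDE_DIR", "./../api/");
include(RDFAPI_INCLUDE_DIR . "RDFAPI.php");

// Load the model saved by Example 9-5
$parser = new RdfParser();
$model =& $parser->generateModel("rdf_output.rdf");

// Keep only the triples that use the PostCon vocabulary
$postcon = $model->findVocabulary(
    "http://burningbird.net/postcon/elements/1.0/");

// Print the filtered model as a table
RDFUtil::writeHTMLTable($postcon);
?>
```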

Two other methods allow for more fine-grained queries: find and findRegex.

The find method takes three parameters: subject, predicate, and object. Passing in null for a specific parameter matches any value for that component in the triple. The findRegex method uses a Perl-style regular expression to check for a match in any of the components. Both methods return a new Model, which you can print out using the RDFUtil method writeHTMLTable. However, if you want to print the data out using your own approach or want to print out only specific components in the resulting triple, you'll have to do a little more work, and will use private methods and members of the RAP class. This makes me hesitant to use RAP for querying.
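The two query methods can be sketched as follows, again using the rdf_output.rdf file from Example 9-5. This assumes the model returned by find offers the same getStatementIterator method used in Example 9-6, which would allow individual components to be printed without touching private members; if your version of RAP doesn't behave this way, the loop won't apply. The regular expression passed to findRegex is an illustrative pattern of my own.

```php
<?php
// Include RDF API (path is installation-specific)
define("RDFAPI_INCLUDE_DIR", "./../api/");
include(RDFAPI_INCLUDE_DIR . "RDFAPI.php");

// Load the model saved by Example 9-5
$parser = new RdfParser();
$model =& $parser->generateModel("rdf_output.rdf");

// find: fix the subject, let predicate and object match anything
$subject = new Resource("http://www.yasd.com/dynaearth/monsters1.htm");
$matches = $model->find($subject, NULL, NULL);

// Assumption: the result model supports the iterator from Example 9-6
$it = $matches->getStatementIterator();
while ($it->hasNext()) {
    $statement = $it->next();
    echo $statement->getLabelPredicate() . " = " .
         $statement->getLabelObject() . "<BR>";
}

// findRegex: Perl-style pattern on the subject, NULL for the rest
$regexMatches = $model->findRegex("/monsters1/", NULL, NULL);
RDFUtil::writeHTMLTable($regexMatches);
?>
```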

What I've done is mix PHP classes when working with RDF. I use RAP to create RDF models, and I then use the PHP XML classes, described in the next chapter, to persist the RDF/XML to a database and to use RDQL queries to query that database.