8.1 Overview of the Classes

Included with the Jena toolkit are the dependencies and installation instructions, which I won't repeat here. I have worked with Jena on Linux (Red Hat), FreeBSD, and Windows; the examples included with Jena and the examples in this chapter work equally well in all environments. The only requirement is that you use JRE 1.2 or above.

The many Java classes in Jena are described in the Javadocs that come with the installation. I won't cover all of them here, only those most critical to understanding Jena's underlying architecture.

I used Jena 1.6.1 in this chapter, but by the time this book is out, Jena 2.0 should be available. The Jena developers are refactoring many of the classes, changing class structure as well as making modifications to the API itself. These changes will break these examples, unfortunately. However, the concepts behind the examples should stay the same, and the book support site will have updated example source.

8.1.1 The Underlying Parser

Included within the Jena toolset is an RDF parser, ARP (an acronym for Another RDF Parser), accessible as a standalone product. You had a chance to look at and work with ARP in Chapter 7, so I won't go into additional detail here, since it works in the background with no further intervention necessary on our part. Our work begins once the RDF data is loaded into a model.

Though not covered in this book, Jena also includes an N3 (Notation3) parser.

8.1.2 The Model

Jena's API architecture focuses on the RDF model, the set of statements that comprises an RDF document, graph, or instantiation of a vocabulary. A basic RDF/XML document is created by instantiating one of the model classes and adding at least one statement (triple) to it. To work with existing RDF/XML, read it into a model and then access the individual elements, either through the API or through the query engine.
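To make the idea of a model as a set of statements concrete, here is a minimal sketch in plain Java. It deliberately does not use Jena: the names Triple, ToyModel, add, list, and sample are hypothetical, invented for this illustration, and the example.org URI is a placeholder. It shows only the concept, which is that a model holds triples, ignores duplicates, and can be searched by pattern.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Toy illustration only: hypothetical classes, NOT part of the Jena API.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Triple)) return false;
        Triple t = (Triple) other;
        return subject.equals(t.subject)
                && predicate.equals(t.predicate)
                && object.equals(t.object);
    }

    @Override
    public int hashCode() {
        return Objects.hash(subject, predicate, object);
    }

    @Override
    public String toString() {
        return subject + " " + predicate + " " + object;
    }
}

final class ToyModel {
    // A model is a set: adding the same statement twice has no effect.
    private final Set<Triple> statements = new LinkedHashSet<>();

    ToyModel add(String subject, String predicate, String object) {
        statements.add(new Triple(subject, predicate, object));
        return this;
    }

    int size() {
        return statements.size();
    }

    // Return the statements matching a pattern; null acts as a wildcard.
    List<Triple> list(String s, String p, String o) {
        List<Triple> matches = new ArrayList<>();
        for (Triple t : statements) {
            if ((s == null || s.equals(t.subject))
                    && (p == null || p.equals(t.predicate))
                    && (o == null || o.equals(t.object))) {
                matches.add(t);
            }
        }
        return matches;
    }

    // Build a small sample model (placeholder subject URI, Dublin Core predicates).
    static ToyModel sample() {
        String doc = "http://example.org/doc1";
        String dc = "http://purl.org/dc/elements/1.1/";
        return new ToyModel()
                .add(doc, dc + "title", "Jena Overview")
                .add(doc, dc + "format", "text/html")
                .add(doc, dc + "title", "Jena Overview"); // duplicate, ignored
    }

    public static void main(String[] args) {
        ToyModel model = sample();
        for (Triple t : model.list(null, null, null)) {
            System.out.println(t);
        }
    }
}
```

The real Jena model classes do far more than this (typed resources and literals, serialization to RDF/XML, and so on), but the add-then-list pattern is the shape to keep in mind while reading the rest of the chapter.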

The ModelMem class creates an RDF model in memory. It extends ModelCom (the class incorporating common model methods used by all models) and implements the key interface, Model. In addition, the DAML class, DAMLModelImpl, subclasses ModelMem.

The ModelRDB class is an implementation of the model used to manipulate RDF stored within a relational database such as MySQL or Oracle. Unlike the memory model, ModelRDB persists the RDF data for later access, and the basic difference in functionality between it and ModelMem is that it opens and maintains a connection to a relational database in addition to managing the data. An interesting additional aspect of this implementation, as we'll see later in Section 8.4, is that you can also specify how the RDF model is stored within a relational database: as a flat table of statements, as a hash, or through stored procedures.

Once data is stored in a model, the next step is querying it.

One major change with Jena 2.0 is the addition of the ModelFactory to create new instances of models.

8.1.3 The Query

You can access data in a stored RDF model directly using specific API function calls, or via RDQL, an RDF query language. As will be demonstrated in Chapter 10, querying data using an SQL-like syntax is a very effective way of pulling data from an RDF model, whether that model is stored in memory or in a relational database.

Jena's RDQL is implemented as an object called Query. Once instantiated, it can then be passed to a query engine (QueryEngine) and the results stored in a query result (QueryResult and various implementations: QueryResultsFormatter, QueryResultsMem, and QueryResultsStream). To access specific returned values, program variables are bound to the result sets using the ResultBinding class.

Once data is retrieved from the RDF/XML, you can iterate through it using any number of iterators. Whether you query data using the Query object or access all RDF/XML elements of a specific class, you can assign the results to an iterator object and iterate through the set, displaying the results or looking for a specific value. Jena provides several iterator classes, each focused on a specific kind of RDF/XML element, such as NodeIterator for general RDF nodes (literal or resource values), ResIterator, and StmtIterator.
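The flow described in this section (build a query, run it against the data, bind variables to the results, then iterate) can be sketched in plain Java. This is a toy under stated assumptions, not Jena's RDQL implementation: ToyQuery, run, and sampleData are hypothetical names, the pattern syntax with a leading ? for variables is a simplification, and the example.org URIs are placeholders.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only: hypothetical names, NOT the Jena RDQL classes.
final class ToyQuery {
    static final String DC_TITLE = "http://purl.org/dc/elements/1.1/title";
    static final String DC_FORMAT = "http://purl.org/dc/elements/1.1/format";

    // Match one triple pattern against the data. Terms starting with '?' are
    // variables; every other term must equal the triple's value exactly.
    // Each match yields one binding of variable names to values.
    static List<Map<String, String>> run(List<String[]> triples, String[] pattern) {
        List<Map<String, String>> results = new ArrayList<>();
        for (String[] triple : triples) {
            Map<String, String> binding = new LinkedHashMap<>();
            boolean matches = true;
            for (int i = 0; i < 3 && matches; i++) {
                if (pattern[i].startsWith("?")) {
                    // A variable used twice must bind to the same value both times.
                    String previous = binding.put(pattern[i], triple[i]);
                    matches = previous == null || previous.equals(triple[i]);
                } else {
                    matches = pattern[i].equals(triple[i]);
                }
            }
            if (matches) {
                results.add(binding);
            }
        }
        return results;
    }

    // Sample data as subject/predicate/object arrays (placeholder URIs).
    static List<String[]> sampleData() {
        List<String[]> triples = new ArrayList<>();
        triples.add(new String[] {"http://example.org/doc1", DC_TITLE, "Jena Overview"});
        triples.add(new String[] {"http://example.org/doc1", DC_FORMAT, "text/html"});
        triples.add(new String[] {"http://example.org/doc2", DC_TITLE, "RDQL Notes"});
        return triples;
    }

    public static void main(String[] args) {
        String[] pattern = {"?doc", DC_TITLE, "?title"};
        // Iterate over the result set, reading bound variables from each row.
        Iterator<Map<String, String>> rows = run(sampleData(), pattern).iterator();
        while (rows.hasNext()) {
            Map<String, String> row = rows.next();
            System.out.println(row.get("?doc") + " has title " + row.get("?title"));
        }
    }
}
```

In Jena itself these roles are split across the Query, QueryEngine, query result, and ResultBinding classes named above. The sketch folds them into one method purely to show how a pattern turns into rows of bound variables that you then walk with an iterator.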

8.1.4 DAML+OIL

Support for DAML+OIL was added to the tool suite in later versions of Jena. DAML+OIL is a language for describing ontologies, a way of describing constraints and refinements for a given vocabulary that are beyond the sophistication of RDFS. Much of the effort on behalf of the Semantic Web is based on the Web Ontology Language at the W3C, which in turn owes much to DAML+OIL. The principal DAML+OIL class within Jena, outside of the DAMLModel, is the DAMLOntology class. I won't be covering the DAML+OIL classes in this chapter, but the creators of Jena provide a tutorial that demonstrates them; it is included in the documents you get when you download Jena.

Ontologies, DAML+OIL, and the W3C ontology language effort, OWL, are described in Chapter 12.