3.6 Application Development

This section presents a quick introduction to programming applications with eXist. We first look at the XML:DB API, which is mainly of interest to Java developers. For those who prefer other programming languages, eXist offers SOAP and XML-RPC interfaces. A small SOAP example using .NET and C# is provided later in this section. Finally, we will see how eXist integrates with Apache's Cocoon.

Due to space restrictions, the examples are simple. The distribution contains more complex examples for each of the APIs discussed here.

3.6.1 Programming Java Applications with the XML:DB API

The preferred way to access eXist from Java applications is to use the XML:DB API. The XML:DB API provides a common interface to native or XML-enabled databases and supports the development of portable, reusable applications.

The vendor-independent XML:DB initiative tries to standardize a common API for access to XML database services, comparable to JDBC or ODBC (Open Database Connectivity) for relational database systems. The API is built around four core concepts: drivers, collections, resources, and services. Drivers encapsulate the whole database access logic. They are provided by the database vendor and have to be registered with the database manager. As in eXist, collections are hierarchical containers, containing other collections or resources. A resource might be either an XML resource or a binary large object. Other types of resources might be added in future versions of the standard. Currently, eXist supports only XML resources. Finally, services may be requested to perform special tasks like querying a collection with XPath or managing collections.

Every application using the XML:DB API has to first obtain a collection object from the database. This is done by calling the static method getCollection of class DatabaseManager. To locate the specified collection, the method expects a fully qualified URI (Uniform Resource Identifier) as parameter, which identifies the database implementation, the collection name, and optionally the location of the database server on the network.

For example, the URI "xmldb:exist:///db/shakespeare" references the Shakespeare collection of an eXist database running in embedded mode. Internally, eXist has two different driver implementations: The first talks to a remote database engine using XML-RPC calls; the second has direct access to a local instance of eXist. Which implementation will be selected depends on the URI passed to the getCollection method. To reference the Shakespeare collection on a remote server, you use "xmldb:exist://localhost:8080/exist/xmlrpc/db/shakespeare".
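The structure of these URIs can be made explicit with a small sketch. The class XmlDbUri below is a hypothetical helper written for illustration only; it is not part of the XML:DB API or of eXist, and the actual driver may parse URIs differently. It assumes the simple form shown in the two examples above: the prefix "xmldb:", a database ID, and then either an empty host part (embedded mode) or a host and port (remote mode).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the XML:DB API or eXist.
final class XmlDbUri {
    // xmldb:<database id>://<optional host[:port]><path>
    private static final Pattern FORMAT =
        Pattern.compile("xmldb:([^:/]+)://([^/]*)(/.*)");

    final String databaseId; // e.g. "exist", used to select the driver
    final String authority;  // host and port; empty in embedded mode
    final String path;       // collection path (for remote URIs it also
                             // includes the servlet path, e.g. /exist/xmlrpc)

    private XmlDbUri(String databaseId, String authority, String path) {
        this.databaseId = databaseId;
        this.authority = authority;
        this.path = path;
    }

    static XmlDbUri parse(String uri) {
        Matcher m = FORMAT.matcher(uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an XML:DB URI: " + uri);
        }
        return new XmlDbUri(m.group(1), m.group(2), m.group(3));
    }

    // An empty host part means the database runs inside the client's JVM.
    boolean isEmbedded() {
        return authority.isEmpty();
    }

    public static void main(String[] args) {
        XmlDbUri local = parse("xmldb:exist:///db/shakespeare");
        XmlDbUri remote =
            parse("xmldb:exist://localhost:8080/exist/xmlrpc/db/shakespeare");
        System.out.println(local.path + " embedded: " + local.isEmbedded());
        System.out.println(remote.authority + " embedded: " + remote.isEmbedded());
    }
}
```

The only difference between the two modes that is visible to the application is the host part of the URI, which is why the same client code can be pointed at either an embedded or a remote database.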

The DatabaseManager keeps a list of available database drivers and uses the database ID specified in the URI ("exist") to select the correct driver class. Drivers may be registered for different databases by calling DatabaseManager.registerDatabase at the start of a program. For example, the code fragment shown in Listing 3.1 registers a driver for eXist.

Listing 3.1 Registering a Driver for eXist

Class cl = Class.forName("org.exist.xmldb.DatabaseImpl");
Database driver = (Database)cl.newInstance();
DatabaseManager.registerDatabase(driver);

Once you have obtained a valid collection object, you may browse through its child collections and resources, retrieve a resource, or request a service. Listing 3.2 presents a complete example, which sends an XPath query passed on the command line to the server.

Listing 3.2 Querying the Database

import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;
import org.xmldb.api.*;

public class QueryExample {
  public static void main(String args[]) throws Exception {
     // register the eXist driver with the database manager
     String driver = "org.exist.xmldb.DatabaseImpl";
     Class cl = Class.forName(driver);
     Database database = (Database)cl.newInstance();
     database.setProperty("create-database", "true");
     DatabaseManager.registerDatabase(database);

     // obtain the root collection and request the XPath query service
     Collection col =
        DatabaseManager.getCollection("xmldb:exist:///db");
     XPathQueryService service =
        (XPathQueryService) col.getService("XPathQueryService", "1.0");
     service.setProperty("pretty", "true");
     service.setProperty("encoding", "ISO-8859-1");
     ResourceSet result = service.query(args[0]);

     // print every result fragment
     ResourceIterator i = result.getIterator();
     while(i.hasMoreResources()) {
        Resource r = i.nextResource();
        System.out.println((String)r.getContent());
     }
  }
}

The sample code registers the driver for a locally attached eXist database and retrieves the root collection. By setting the "create-database" property to "true", the database driver is told to create a local database instance if none has been started before. A service of type "XPathQueryService" is then requested from the collection. To execute the query, method service.query(String xpath) is called. This method returns a ResourceSet, containing the query results. Every resource in the ResourceSet corresponds to a single result fragment or value. To iterate through the resource set we call result.getIterator().

To have the client query a remote database, the URI passed to DatabaseManager.getCollection() has to be changed as explained previously: For example, to access a database engine running via the Tomcat Web server, use DatabaseManager.getCollection("xmldb:exist://localhost:8080/exist/xmlrpc/db").

3.6.2 Accessing eXist with SOAP

In addition to the XML:DB API for Java development, eXist provides XML-RPC and SOAP interfaces for communication with the database engine. This section presents a brief example of using SOAP to query the database. A description of the XML-RPC interface is available in the "Developer's Guide" that can be found on the home page of eXist.

XML-RPC as well as SOAP client libraries are available for a large number of programming languages. Each protocol has benefits and drawbacks. While a developer has to code XML-RPC method calls by hand, programming with SOAP is usually more convenient, because most SOAP tools will automatically create the low-level code from a given WSDL (Web Services Description Language) service description. Additionally, SOAP transparently supports user-defined types. Thus the SOAP interface to eXist has a slightly cleaner design, because fewer methods are needed to expose the same functionality. On the other hand, SOAP toolkits tend to be more complex.

eXist uses the Axis SOAP toolkit from Apache, which runs as a servlet. Once the Tomcat Web server contained in the eXist distribution has been started, the developer is ready to access eXist's SOAP interface. Two Web services are provided: The first allows querying of the server; the second is used to add, view, and remove documents or collections.

Listing 3.3 presents an example of a simple client written in C# with Microsoft's .NET framework. For this example, the wsdl.exe tool provided with .NET was used to automatically generate a client stub class from the WSDL Web service definition (query.wsdl), which describes eXist's query service. The tool produces a single file (QueryService.cs), which has to be linked with the developer's own code. Because the generated classes hide all the SOAP-related code, the developer does not even have to know that a remote Web service is being accessed.

Listing 3.3 Simple .NET Client in C#

using System;

public class SoapQuery {
    static void Main(string[] args) {
        string query;
        if(args.Length < 1) {
            Console.Write("Enter a query: ");
            query = Console.ReadLine();
        } else
            query = args[0];
        QueryService qs = new QueryService();

        // execute the query
        QueryResponse resp = qs.query(query);
        Console.WriteLine("found: {0} hits in {1} ms.", resp.hits,
        resp.queryTime);

        // print a table of hits by document for every collection
        foreach (QueryResponseCollection collection in resp.collections) {
            Console.WriteLine(collection.collectionName);
            QueryResponseDocument[] docs = collection.documents;
            foreach (QueryResponseDocument doc in docs)
                Console.WriteLine('\t' + doc.documentName.PadRight(40, '.') +
                    doc.hitCount.ToString().PadLeft(10, '.'));
        }
        // print some results
        Console.WriteLine("\n\nRetrieving results 1..5");
        for(int i = 1; i <= 5 && i <= resp.hits; i++) {
            byte[] record = qs.retrieve(resp.resultSetId, i, "UTF-8", true);
            string str = System.Text.Encoding.UTF8.GetString(record);
            Console.WriteLine(str);
        }
    }
}

The client simply instantiates an object of class QueryService, which had been automatically created from the WSDL description. The XPath query is passed to method query, which returns an object of type QueryResponse. QueryResponse contains some summary information about the executed query. The most important field is the result-set id, which is used by the server to identify the generated result-set in subsequent calls. To actually get the query results, the retrieve method is called with the result-set id. The XML is returned as a byte array to avoid possible character-encoding conflicts with the SOAP transport layer.
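For readers working in Java rather than C#, the decoding step can be sketched as follows. This is an illustration only: the class RecordDecoding and the sample bytes are made up for the example and do not come from eXist or from a real server response. The sketch assumes that the client asked the server for UTF-8, as the C# client in Listing 3.3 does.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; the class name and sample data are assumptions.
final class RecordDecoding {
    // Decode a record using the same encoding that was requested from the
    // server (UTF-8 in this sketch).
    static String decode(byte[] record) {
        return new String(record, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes a retrieve call might return.
        byte[] record =
            "<dc:title>Caf\u00e9</dc:title>".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(record));
    }
}
```

Because the bytes pass through the transport layer unchanged, the client alone decides how to turn them into characters, and non-ASCII content survives as long as both sides agree on the encoding name.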

3.6.3 Integration with Cocoon

The combination of eXist and Cocoon opens up many opportunities, ranging from simple Web-based query interfaces to, from a future perspective, a complete content management system. Cocoon provides a powerful application server environment for the development of XML-driven Web applications, including configurable transformation pipelines, XML server pages for dynamic content generation, and output to HTML, PDF, SVG (Scalable Vector Graphics), WAP (Wireless Application Protocol), and many other formats (for a complete description, see http://xml.apache.org/cocoon). Cocoon enables developers to create complex Web applications entirely based on XML and related technologies. Web site creators and developers are allowed to think in XML from A to Z.

Cocoon sites are configured in an XML file called sitemap.xmap. Most important, this file defines the processing pipelines Cocoon uses to process HTTP requests. Pipelines may be arbitrarily complex, using any mixture of static and dynamic resources. Every processing step in a pipeline is expected to produce a SAX stream, which is consumed by the next step.
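The idea of steps connected by SAX streams can be illustrated outside Cocoon with the SAX classes that ship with Java. The following sketch is not Cocoon code and does not use Cocoon's interfaces; the class names are invented for the example. A parser plays the role of the first step, an XMLFilterImpl subclass plays the role of an intermediate step, and a content handler plays the role of the final step.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Illustration of SAX chaining with the JDK only; not Cocoon's own API.
final class SaxPipelineSketch {

    // Intermediate step: receives SAX events, upper-cases character data,
    // and passes the events on to the next step.
    static final class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void characters(char[] ch, int start, int length)
                throws SAXException {
            char[] upper = new String(ch, start, length)
                .toUpperCase(Locale.ROOT).toCharArray();
            super.characters(upper, 0, upper.length);
        }
    }

    // Final step: consumes the events and writes simplified markup
    // (element names and text only, attributes are ignored).
    static final class TextCollector extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName,
                                 Attributes atts) {
            out.append('<').append(qName).append('>');
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            out.append("</").append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    static String run(String xml) {
        try {
            // First step: the parser turns the document into SAX events.
            XMLReader generator =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            XMLFilterImpl transformer = new UpperCaseFilter(generator);
            TextCollector serializer = new TextCollector();
            transformer.setContentHandler(serializer);
            transformer.parse(new InputSource(new StringReader(xml)));
            return serializer.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("<title>hamlet</title>"));
    }
}
```

No step needs to hold the whole document in memory: each one reacts to events as they arrive and forwards its own events to the step behind it.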

To use eXist, no changes to Cocoon itself are required. Beginning with version 2.0, Cocoon supports pseudo protocols, which allow the registration of handlers for special URLs via source factories. Current Cocoon distributions include a source factory to access XML:DB-enabled databases. Once the handler has been registered, it is possible to use any valid XML:DB URI wherever Cocoon expects a URL in its site configuration file.

As a practical example, suppose that you have written a small bibliography for an article. The bibliography has been coded in RDF (Resource Description Framework), using Dublin Core for common fields like title, creator, and so on. Additionally, you have written an XSLT style sheet to transform the data into HTML for display. The source file (xml_books.xml) as well as the XSLT style sheet (bib2html.xsl) have been stored in eXist in a collection called "/db/bibliography".

You would like to get the file formatted in HTML if you access it in a browser. Adding the snippet shown in Listing 3.4 to the <map:pipelines> section of Cocoon's sitemap.xmap will do the job:

Listing 3.4 Additions to Cocoon's sitemap.xmap to Obtain HTML Output

<map:pipeline>
 <!-- . . . more definitions here . . . -->
 <map:match pattern="xmldb/db/bibliography/*.html">
    <map:generate src="xmldb:exist:///db/bibliography/{1}.xml"/>
    <map:transform src="xmldb:exist:///db/bibliography/bib2html.xsl"/>
    <map:serialize type="html"/>
 </map:match>
 <!-- . . . -->
</map:pipeline>

If you browse to any location matching the pattern "xmldb/db/bibliography/*.html" relative to Cocoon's root path (e.g., http://localhost:8080/exist/xmldb/db/bibliography/xml_books.html), you will get a properly formatted HTML display. Instead of loading the files from the file system, Cocoon will retrieve the XML source and the style sheet from eXist.
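The three steps of this pipeline (generate, transform, serialize) correspond roughly to what the following stand-alone Java sketch does with the XSLT support built into the JDK. It is meant only to show the data flow. The class name, the tiny style sheet, and the sample record are assumptions made for the example; they are not the bib2html.xsl or xml_books.xml files mentioned above, and the sketch reads from strings instead of loading anything from eXist.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Stand-alone illustration of generate -> transform -> serialize.
// The style sheet below is a made-up stand-in, not bib2html.xsl.
final class BibliographyToHtml {
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:dc='http://purl.org/dc/elements/1.1/'"
      + " exclude-result-prefixes='dc'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/'>"
      + "<ul><xsl:for-each select='//dc:title'>"
      + "<li><xsl:value-of select='.'/></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String toHtml(String xml) {
        try {
            // "transform": compile the style sheet
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            // "generate" reads the source; "serialize" writes HTML text
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String rdf =
            "<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'"
          + " xmlns:dc='http://purl.org/dc/elements/1.1/'>"
          + "<rdf:Description><dc:title>Sample Title</dc:title>"
          + "</rdf:Description></rdf:RDF>";
        System.out.println(toHtml(rdf));
    }
}
```

In Cocoon the same flow is declared in the sitemap rather than programmed, and the xmldb: URIs make the database the source of both the document and the style sheet.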

You may also want to add a small search interface to find entries by author, title, or date. Cocoon offers XSP for writing XML-based dynamic Web pages. Similar to JSP, XSP embeds Java code in XML pages. To better support the separation of content and programming logic, XSP also enables you to put reusable code into logic sheets, which correspond to tag libraries in JSP. Logic sheets help keep the amount of Java code inside an XSP page to a minimum.

eXist includes a logic sheet based on the XML:DB API, which defines tags for all important tasks. You could also write all the XML:DB-related code by hand, but using the predefined tags usually makes the XML file more readable and helps users without Java knowledge understand what's going on. To give the reader an idea of how the XSP code might look, Listing 3.5 shows a simple example.

Listing 3.5 Using the XML:DB XSP Logic Sheet

<xsp:page
      xmlns:xsp="http://apache.org/xsp"
      xmlns:xmldb="http://exist-db.org/xmldb/1.0"
>
      <html>
         <body>
            <h1>Process query</h1>
              <xmldb:collection uri="xmldb:exist:///db/bibliography">
                <xmldb:execute>
                  <xmldb:xpath>"document(*)//rdf:Description[dc:title"
                   + "&amp;='" + request.getParameter("title") + "']"
                   </xmldb:xpath>
                   <p>Found <xmldb:get-hit-count/> hits.</p>

                   <xmldb:results>
                      <pre>
                        <xmldb:get-xml as="string"/>
                      </pre>
                   </xmldb:results>
                </xmldb:execute>
              </xmldb:collection>
         </body>
      </html>
</xsp:page>

The Cocoon version included with eXist has been configured (in cocoon.xconf) to recognize the xmldb namespace and associate it with the XML:DB logic sheet. All you have to do is to include the correct namespace declaration in your page (xmlns:xmldb="http://exist-db.org/xmldb/1.0").

The page executes a query on the dc:title element using the HTTP request parameter "title" to get the keywords entered by the user. As required by the XML:DB API, any action has to be enclosed in an xmldb:collection element. The query is specified in the xmldb:xpath tag using a Java expression, which inserts the request parameter into an XPath query. The xmldb:results element will iterate through the generated result set, inserting each resource into the page by calling xmldb:get-xml.
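To make the Java expression in Listing 3.5 easier to follow, the sketch below builds the same string in plain Java. The class and method names are hypothetical and exist only for this illustration. Note that inside the XSP page the ampersand has to be written as the XML escape &amp;amp;, whereas the Java code that finally runs sees the plain character sequence &amp;=, here written directly as "&=".

```java
// Hypothetical helper for illustration; it mirrors the expression used in
// the xmldb:xpath element but is not part of the logic sheet.
final class TitleQuery {
    // Insert the user's keywords into the XPath query as a quoted literal.
    static String build(String title) {
        return "document(*)//rdf:Description[dc:title" + "&='" + title + "']";
    }

    public static void main(String[] args) {
        // Corresponds to a request such as bibquery.xsp?title=computer
        System.out.println(build("computer"));
    }
}
```

Because the parameter is pasted into the query as a quoted literal, a value that itself contains an apostrophe would end the literal early; a production page would presumably need to check or escape the input before building the query.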

Note that in this simple example, all resources are converted to strings to display the results in a browser. If you wanted to post-process results, for example by applying a style sheet to the generated output, you would use <xmldb:get-xml as="xml"/>.

To tell Cocoon how to process this page, you finally have to insert a new pattern into the sitemap, as shown in Listing 3.6.

Listing 3.6 Processing Pattern to Be Inserted into Cocoon Sitemap

<map:match pattern="bibquery.xsp">
  <map:generate type="serverpages" src="bibquery.xsp"/>
  <map:serialize type="html"/>
</map:match>

To see if the page works, you may now enter into your Web browser the URL http://localhost:8080/exist/bibquery.xsp?title=computer.
