2.4 Querying XML


As already mentioned, Tamino XML Server supports methods to directly retrieve documents that were previously stored using Tamino (e.g., direct addressing of documents via URL). This includes documents with mapping to external data sources or to functions. However, just retrieving documents by their name or by their identifier is only part of the functionality needed in an XML database system. Powerful query facilities must complement the direct access methods.

2.4.1 Query Language: Tamino X-Query

Currently, there is no standardized XML query language. The W3C is working on one, called XQuery (Boag et al. 2002), but it is still in draft status. As a consequence, database systems that store XML have to use another language for querying. A common choice is XPath (Clark and DeRose 1999), which is also the basis for Tamino's query language. Tamino extends XPath in two respects:

  • While the navigation-based approach of XPath fits the needs of retrieval in data-centric environments, document-centric environments need a more content-based retrieval facility. Therefore, Tamino XML Server also supports full-text search over the contents of attributes and elements (including their children, ignoring markup). For this purpose, Tamino defines a new relational operator ~= (the contains operator). With the Tamino text search capabilities, the occurrence of a word or a phrase in the content of an element can be tested. Case is ignored in text search. An asterisk represents a wildcard for the search: //*[.~="XML" ADJ "database*"] finds all elements that contain strings such as "XML databases", "xml databases", "xml database systems". Text search functionality is independent of the existence of text indexes; they just accelerate the search.

  • To enable the user to integrate dedicated functionality into queries, Tamino allows user-defined functions to be added to the query language. This is another functionality of the X-Tension component.
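The matching rules of the contains operator described above (case ignored, markup ignored, asterisk as wildcard, adjacent words) can be illustrated with a small Python emulation. This is only a sketch of the described semantics, not Tamino's implementation: the function names are hypothetical, word splitting is a simple regular expression, and ADJ is assumed to mean "immediately adjacent, in the given order".

```python
import re
import xml.etree.ElementTree as ET


def element_text(elem):
    """Text content of an element including its children, ignoring markup."""
    return "".join(elem.itertext())


def tokenize(text):
    """Split text into lowercase word tokens (case is ignored in text search)."""
    return re.findall(r"\w+", text.lower())


def term_matches(term, token):
    """Match one search term against one token; '*' in the term is a wildcard."""
    regex = ".*".join(re.escape(part) for part in term.lower().split("*"))
    return re.fullmatch(regex, token) is not None


def contains_adjacent(text, terms):
    """True if the terms occur as adjacent words, in order, somewhere in text.

    Assumption: this models the ADJ connective from the example query.
    """
    tokens = tokenize(text)
    n = len(terms)
    return any(
        all(term_matches(t, tok) for t, tok in zip(terms, tokens[i:i + n]))
        for i in range(len(tokens) - n + 1)
    )


# Emulates .~="XML" ADJ "database*" on one element.
elem = ET.fromstring("<p>An <b>xml</b> database system</p>")
print(contains_adjacent(element_text(elem), ["XML", "database*"]))
```

A real text index would answer the same question without scanning the element content, which is why the presence of an index changes performance but not the result.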

With the progress of the W3C standardization efforts for XQuery, Tamino XML Server will also support this query language.

2.4.2 Sessions and Transactions

Tamino operations (including queries) can be executed inside or outside a session context. In the latter case, such an operation is a transaction on its own; that is, after its execution, all resources used by this operation are released. In case of an error, all effects of the operation are undone; otherwise all effects of the operation are made persistent.

A session groups multiple operations together. The session is the unit of user authentication. In a connect request, the user presents his credentials and sets the default operation modes for all operations executed in the session. Among them is the isolation level for the session, which is analogous to the isolation levels in SQL: The higher the isolation level, the fewer anomalies can occur. On the other hand, the higher the isolation level, the higher the impact on parallel operations. The isolation level defined for a session can be overridden on the statement level. Within a session, multiple consecutive transactions can occur.
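The difference between a sessionless operation and a session can be sketched with a toy in-memory model. Everything here is hypothetical (class names, method names, the isolation-level string); it is not Tamino's API and it does not implement real isolation. It only shows the all-or-nothing behavior of a single operation, consecutive transactions within a session, and a session default that a single statement may override.

```python
import copy


class ToyStore:
    """In-memory stand-in for a database (hypothetical, not Tamino)."""

    def __init__(self):
        self.docs = {}

    def execute(self, operation):
        """Sessionless operation: a transaction on its own."""
        working = copy.deepcopy(self.docs)
        operation(working)       # an exception here leaves self.docs untouched
        self.docs = working      # success: all effects become persistent


class ToySession:
    """Groups operations; several consecutive transactions may occur."""

    def __init__(self, store, isolation="session-default"):
        self.store = store
        self.isolation = isolation                 # default for the session
        self._working = copy.deepcopy(store.docs)

    def execute(self, operation, isolation=None):
        """Run one statement; return the isolation level that applied to it."""
        level = isolation or self.isolation        # statement-level override
        operation(self._working)
        return level

    def commit(self):
        """End the current transaction and make its effects persistent."""
        self.store.docs = copy.deepcopy(self._working)

    def rollback(self):
        """End the current transaction and discard its effects."""
        self._working = copy.deepcopy(self.store.docs)
```

In this model a failing sessionless operation never changes the store, while a session only changes it at commit time.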

2.4.3 Handling of Results

The result of a Tamino query is well-formed XML. If a query returns more than one document (or document fragment), the pure concatenation of them would not yield well-formed XML because an XML document must have exactly one root. Therefore, Tamino wraps the result set in an artificial root element. This also contains context information such as error messages. To facilitate handling of large response sets, Tamino offers a cursor mechanism on query results. Cursor information is also included in the result wrapper. The result of fetching just one result element at a time for the query /City/Monument/Name looks like that shown in Listing 2.7.

Listing 2.7 Result of /City/Monument/Name Query
<?xml version="1.0" encoding="iso-8859-1" ?>
<ino:response xmlns:ino="http://namespaces.softwareag.com/tamino/response2"
       xmlns:xql="http://metalab.unc.edu/xql/" ino:sessionid="15"
       ino:sessionkey="18362">
  <xql:query>/City/Monument/Name</xql:query>
  <ino:message ino:returnvalue="0">
    <ino:messageline>fetching cursor</ino:messageline>
  </ino:message>
  <xql:result>
    <Name ino:id="3">Russian Chapel</Name>
  </xql:result>
  <ino:cursor ino:handle="1">
    <ino:current ino:position="3" ino:quantity="1" />
    <ino:next ino:position="4" />
    <ino:prev ino:position="2" />
  </ino:cursor>
  <ino:message ino:returnvalue="0">
    <ino:messageline>cursor fetched</ino:messageline>
  </ino:message>
</ino:response>
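
A client has to unwrap such a response before it can use the result. The following Python sketch parses a response shaped like Listing 2.7 with the standard library and pulls out the result elements, the message lines, and the cursor positions. The function names and the returned dictionary layout are illustrative choices, and the sample omits the XML declaration. The ino namespace is read from the root element instead of being hard-coded.

```python
import xml.etree.ElementTree as ET

RESPONSE = """\
<ino:response xmlns:ino="http://namespaces.softwareag.com/tamino/response2"
       xmlns:xql="http://metalab.unc.edu/xql/" ino:sessionid="15"
       ino:sessionkey="18362">
  <xql:query>/City/Monument/Name</xql:query>
  <ino:message ino:returnvalue="0">
    <ino:messageline>fetching cursor</ino:messageline>
  </ino:message>
  <xql:result>
    <Name ino:id="3">Russian Chapel</Name>
  </xql:result>
  <ino:cursor ino:handle="1">
    <ino:current ino:position="3" ino:quantity="1" />
    <ino:next ino:position="4" />
    <ino:prev ino:position="2" />
  </ino:cursor>
  <ino:message ino:returnvalue="0">
    <ino:messageline>cursor fetched</ino:messageline>
  </ino:message>
</ino:response>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_response(xml_text):
    """Unwrap a response: results, message lines, return values, cursor."""
    root = ET.fromstring(xml_text)
    ino_ns = root.tag[1:].split("}", 1)[0]   # namespace of the wrapper root

    def ino(name):
        return "{%s}%s" % (ino_ns, name)

    info = {"results": [], "messages": [], "return_values": [], "cursor": {}}
    for child in root:
        kind = local_name(child.tag)
        if kind == "result":
            info["results"] = [(local_name(e.tag), e.text) for e in child]
        elif kind == "message":
            info["return_values"].append(int(child.get(ino("returnvalue"))))
            info["messages"] += [line.text for line in child]
        elif kind == "cursor":
            for pos in child:
                info["cursor"][local_name(pos.tag)] = int(pos.get(ino("position")))
    return info


info = parse_response(RESPONSE)
print(info["results"], info["cursor"])
```

The "next" and "prev" positions are what a client would use to request the following or preceding result element from the cursor.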

2.4.4 Query Execution

When a query is sent to Tamino XML Server, the first step is to transform it into Unicode. After this, the query is parsed and checked for syntactical correctness. In the following optimization step, the query is matched against the schema definitions. Depending on the existence of a schema, open or closed content definition, and existence of full or condensed structure index, some queries will return to the user with an empty result even before accessing data. The next step is the selection of appropriate indexes to use for the evaluation of the query. Index-based selection of documents is then performed, and the remaining parts of the query are evaluated on the result.
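
The steps above can be sketched as a toy pipeline in Python. This is an illustration under strong simplifying assumptions, not Tamino's query processor: queries are restricted to simple child paths such as /City/Monument/Name, a closed schema is modeled as a set of allowed paths, and the index is a plain dictionary from path to document IDs. All names are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_path(query):
    """Parse a simple absolute path into its steps, rejecting bad syntax."""
    if not query.startswith("/") or "" in query[1:].split("/"):
        raise ValueError("syntax error in query: %r" % query)
    return tuple(query[1:].split("/"))


def select(root, steps):
    """Evaluate the path on one document and return the matching elements."""
    if root.tag != steps[0]:
        return []
    if len(steps) == 1:
        return [root]
    return root.findall("/".join(steps[1:]))


def execute(raw_query, encoding, schema_paths, path_index, documents, access_log):
    query = raw_query.decode(encoding)       # 1. transform the query to Unicode
    steps = parse_path(query)                # 2. parse and check the syntax
    if schema_paths is not None and steps not in schema_paths:
        return []                            # 3. closed schema: no match possible
    doc_ids = path_index.get(steps)          # 4. pick an index if one exists
    if doc_ids is None:
        doc_ids = list(documents)            #    no index: consider every document
    results = []
    for doc_id in doc_ids:                   # 5. evaluate the rest on candidates
        access_log.append(doc_id)            #    record which documents were read
        results.extend(e.text for e in select(documents[doc_id], steps))
    return results


documents = {
    1: ET.fromstring("<City><Monument><Name>Russian Chapel</Name></Monument></City>"),
    2: ET.fromstring("<City><Name>Darmstadt</Name></City>"),
}
schema_paths = {("City",), ("City", "Name"), ("City", "Monument"),
                ("City", "Monument", "Name")}
path_index = {("City", "Monument", "Name"): [1]}

log = []
print(execute(b"/City/Monument/Name", "iso-8859-1",
              schema_paths, path_index, documents, log))
print(execute(b"/City/Museum", "iso-8859-1",
              schema_paths, path_index, documents, log))
print(log)   # only document 1 was read; the second query touched no data
```

Passing None for schema_paths models the case without a closed schema, where the early empty answer is not possible and the query has to be evaluated against the data.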


