6.4 Using XSU for Fine-Grained Storage

  Previous section   Next section

Handling XML documents in a fine-grained manner is supported by Oracle's XSU (XML SQL Utility). XSU is a programmatic interface for Java and PL/SQL programs. XSU supports two directions:

  1. XSU transforms data from object-relational tables or views into XML; XML documents are generated out of relational data.

  2. Splitting XML documents into pieces and storing them in tables the other way around.

6.4.1 Canonical Mapping

The basis for both directions is a schematic mapping from relational structures to XML. Given an SQL query related to any tables or views, an XML document is produced:

  • The whole query result is enclosed by <ROWSET> . . . <s/ROWSET>

  • Each tuple of the result is put in <ROW> . . . </ROW>

  • Attribute values turn into tags <Name> . . . </Name> taking the column name

If aliases are used in SQL queries, the aliases are taken as tags instead of column names. Thus, an alias is useful to give expressions such as COUNT(*) a readable name.

These basic mechanisms produce XML documents. This allows for only simple XML documents having a limited nesting level?that is, <ROWSET>, <ROW>, <Attributename>. But the mapping is also able to handle object-oriented concepts in the following way:

  • Object types: If an attribute is object-valued, then the object type's attributes become inner tags according to the internal structure.

  • Collections: The name of a V Array or a nested table type becomes a tag to enclose the collection.

    The elements of the inner collection use the collection tag and append _ITEM.

  • References: A type tag with an object identifier uses the syntax <Tag REFTYPE="REF T"> 0ABD1F6. . . </Tag>.

Furthermore, it is worth mentioning that attributes or aliases starting with an at sign (@) have a special meaning: The values are taken for tag attributes instead of tag values.

The example in Listing 6.11 assumes two tables and three user-defined types for keeping customer information and their phone numbers.

Listing 6.11 Customers and Phone Numbers
CREATE TABLE Customer (CNo INT, FirstName VARCHAR, LastName
                  VARCHAR, Zip VARCHAR, City VARCHAR, Street
                  VARCHAR, Houseno VARCHAR);
                                    Street VARCHAR, Houseno VARCHAR);

The next query (see Listing 6.12) produces a nested structure, constructing objects of type Name_Type with first name and last name, building an Address_Type object, and putting the customer's phones in a Phone_tab.

Listing 6.12 Nested Structure Query
       Name_Type(c.FirstName,c.LastName) AS Name,
       Address_Type(c.Zip,c.City,c.Street,c.Houseno) AS Address,
       CAST (MULTISET (SELECT Phone FROM Phonelist WHERE CNo=c.CNo)
                       AS Phone_Tab) AS Phones
FROM Customer c

The query result possesses the following structure:

(Cno int, Name Name_Type, Address Address_Type, Phones Phone_Tab)

whereby Phone_Tab represents an inner table?that is, each value is a list of phone numbers. Each user-defined object type possesses a constructor like Address_Type to create objects; the parameters correspond to the internal type structure. CAST/MULTISET converts a SELECT-FROM-WHERE query into a collection, here of type Phone_Tab. That is, we obtain a "nested result table" having an inner table of phone numbers for each customer.

Processing the query in interactive SQL produces an XML document with a complex nested structure, as shown in Listing 6.13.

Listing 6.13 Query Output
<?xml version="1.0"?>
   <ROW NUM="1" CNO="10">
         <FIRSTNAME> Lucky </FIRSTNAME>
         <LASTNAME> Luke </LASTNAME>
         <ZIP> 12345 </ZIP>
         <CITY> Bull </CITY>
         <STREET> Cows Xing </STREET>
         <HOUSENO> 8 </HOUSENO>
         <PHONES_ITEM> 012/3456 </PHONES_ITEM>
         <PHONES_ITEM> 023/4567 </PHONES_ITEM>
   <ROW NUM="2" CNO="20">
   . . .

Each tuple is put into <ROW> . . . </ROW> and automatically numbered by an attribute NUM. The attributes of an object type are taken as tags?for example, FIRSTNAME and LASTNAME according to the inner structure of the Name attribute. The XML representation of a nested table uses a PHONES tag for the whole collection, and PHONES_ITEM for each entry.

Since the column CNo is renamed to @CNo, the value is taken as an attribute value CNO="10" in the XML document.

Similarly, an object view can be defined as shown in Listing 6.14.

Listing 6.14 Create a View
( CNo     INTEGER,
  Name    Name_Type,
  Addr    Address_Type,
  Phones  Phone_List
CREATE VIEW MyCustomer OF Customer_Type AS
SELECT . . .  query from above . . . ;

The object type Customer_Type describes the structure of the query that is used to define the object view. Then, SELECT * FROM MyCustomer would return the same XML document. The mapping is then based upon the structure of the object type Customer_Type.

Hence, the approach gains power pretty much from object-relational features and object view mechanisms: Applying object-relational concepts allows for nesting of tags to any depth according to the complex structure. Particularly, the object-relational concepts provide a powerful means to obtain nested XML structures even in case of 1NF tables.

Oracle's XSU defines a Java interface and two packages in the proprietary PL/SQL language to use XSU in programs:

  1. OracleXMLQuery and DBMS_XMLQuery, respectively

  2. OracleXMLSave and DBMS_XMLSave, respectively

In the following sections, only the handling in Java is presented. The principles of the PL/SQL packages are similar.

6.4.2 Retrieval

Listing 6.15 shows how to use XSU for executing a query in a Java environment and producing XML.

Listing 6.15 XSU Example
Connection conn = //JDBC database connection
"scott", "tiger");
OracleXMLQuery q = new OracleXMLQuery(conn,
"SELECT . . . FROM  . . .  WHERE  . . . ");
org.w3c.dom.Document domDoc =q.getXMLDOM();  // DOM representation
XMLDocument xmlDoc = (XMLDocument)domDoc;
StringWriter s = new StringWriter(10000);
xmlDoc.print(new PrintWriter(s));

The JDBC database connection specifies the URL of the database and the schema "scott" to be queried. The Java class OracleXMLQuery possesses two constructors that take an SQL query either as a string value or as a JDBC ResultSet:

  • OracleXMLQuery(Connection conn, String query)

  • OracleXMLQuery(Connection conn, ResultSet rset)

The resulting document can be obtained in different forms by various get methods:

  • java.lang.String getXMLString([int metaType])

  • java.lang.String getXMLString(org.w3c.dom.Node root [ ,int metaType])

  • org.w3c.dom.Document getXMLDOC([ int metaType ])

  • org.w3c.dom.Document getXMLDOC(org.w3c.dom.Node root [ ,int metaType])

  • void getXMLSAX(org.xml.sax.ContentHandler sax)

getXMLString creates a string representation as output. A variant of getXMLString uses a node-parameter that becomes the root of the resulting document. An optional int-parameter metaType specifies whether a DTD or a XML Schema is to be created for the document. The constants for metaType are DTD, SCHEMA, and NONE (the default).

The getXMLDOC method possesses the same parameters but returns the resulting document in a DOM representation. The code in Listing 6.15 passes the Document object to a PrintWriter for output. Using the DOM representation, the complete functionality of the Oracle XML Development Kit (XDK) can be used then, for example, to parse the document, to perform XSLT transformations, to extract parts of the document, to create a DTD, and so on.

A further method getXMLSAX assigns a SAX-ContentHandler to the document. Hence, the document can be handled in a SAX-conforming manner. The parameter sax must be previously registered.

OracleXMLQuery has various methods for processing results incrementally:

  • void setMaxRows(int max): Processes a certain amount of tuples

  • void setSkipRows(int n): Skips n tuples (being already read)

  • restart(): Executes the query again (for the next increment)

  • long getNumRowsProcessed(): Returns the total number of processed tuples

  • void keepObjectOpen(boolean): Lets the query (i.e., the underlying ResultSet) remain open for the database session

These methods are useful to convert a query result piecewise into several XML documents: Setting the bulk to n by means of setMaxRows, a loop skips i*n tuples in the i-th run before restarting the query.

Further methods of OracleXMLQuery enable one to rename the predefined tags:

  • void setRowTag(java.lang.String tagname): Renames the ROW tag.

  • void setRowsetTag(java.lang.String tagname): Renames the ROWSET tag.

  • void setRowldAttrName(java.lang.String tagname): Renames the NUM attribute of the ROW tag.

  • void setRowIdAttrValue(java.lang.String tagname): Determines an attribute the values of which are taken for the NUM attribute; passing the null object, the rows are sequentially numbered starting with 1.

  • void useTypeForCollElemTag(boolean): Tags the entries of the inner collection with the type name of the collection; by default _ITEM is attached to the collection tag.

  • void useUpperCaseTagNames(): Puts tag names in uppercase.

  • void useLowerCaseTagNames(): Puts tag names in lowercase.

However, the best flexibility is obtained by transforming documents with the help of style sheets. XSL (eXtensible Stylesheet Language) is a powerful language to describe transformations. Particularly, the sublanguage XSLT (XSL Transformations) allows one to transform XML into another text-based format. XSLT is thus ideal to perform additional transformations on the resulting XML document. In XSU, a style sheet is registered as follows by passing a Java String or Reader object:

  • void setXSLT(java.lang.String stylesheet)

  • void setXSLT(Reader stylesheet)

Additional methods are not explained here in detail. For instance:

  • void setStylesheetHeader(java.lang.String uri [, java.lang. String type ]): Sets the style sheet header in the document

  • org.w3c.dom.Document getXMLSchema(): Creates an XML schema for the result

  • String getXMLMetaData(int metaType, boolean mitVersion): Generates a DTD or an XML schema as a string

6.4.3 Modifications

The other way around, XML documents that conform to the canonical mapping can be stored in a relational table or in a view in a fine-grained manner. Using views, several tables beneath the view definition can be filled. Otherwise, only the storage in one table is possible.

The piece of code in Listing 6.16 shows the principle. Assume the URL of an XML file is given.

Listing 6.16 Using Views
OracleXMLSave sav = new OracleXMLSave(conn, "MyCustomer");

The method getURL(String) of class OracleXMLSave is useful for processing files by means of a URL: Given a string that contains a filename or a URL, getURL creates a java.net.URL object, which can be used for insertXML. The content of the file is stored in the view MyCustomer.

The referenced XML document must certainly have a structure that fits to the tables in order to store the derived tuples. One exception to the rule is that tags in the document may be omitted; tags that do not occur are stored as NULL values.

The constructor of OracleXMLSave takes a table or view that is to be filled. Views are useful if XML documents are to be spread across several tables; the view abstracts from the underlying tables, thus defining a mapping that breaks down documents into pieces. The mechanism of view update is used thereby. In general, updating views requires an INSTEAD OF trigger that defines the effect of INSERT, UPDATE, and DELETE on the underlying base tables.

OracleXMLSave possesses three main methods for manipulation:

  • int insertXML(org.w3c.dom.Document)

  • int updateXML(org.w3c.dom.Document)

  • int deleteXML(org.w3c.dom.Document)

The methods are also available with alternative parameters of type java.lang.String, Reader, InputStream, and java.net.URL, each representing the XML document. All the methods return the number of modified tuples.

In case of updateXML and deleteXML, the given document is taken as a pattern for qualifying documents?that is, the document determines the query values of the resulting WHERE condition. Which attributes are really participating in the condition are specified separately by means of the method voidsetKeyColumnList(String[] ). Several attribute names can be passed as a String array. The corresponding tag values are extracted and used in the condition.

The document also defines the values to be changed or stored in case of updateXML and insertXML. That is, the document contains the new values being used for UPDATE . . . SET (updateXML) and INSERT INTO . . . VALUES (insertXML). The attributes to be modified are specified in a method void setUpdateColumnList(String[] ). See Listing 6.17 for an example.

Listing 6.17 insertXML Example
OracleXMLSave sav = new OracleXMLSave(conn, "MyCustomer");
String[] list = new String[2];
list[0] = "CNo";        // insert only CNo and
list[1] = "Name";       // Name in view MyCustomer
Document doc = sav.getUrl("myUrl");
sav.insertXML(doc);     // extract CNo and Name and insert them

This program performs the following SQL statement:

INSERT INTO MyCustomer (CNo, Name)
VALUES (:cno, :n)

The host variables :cno and :n are the CNo- and Name values extracted from the XML document. An update is shown in Listing 6.18.

Listing 6.18 updateXML Example
list[0] = "Name";             // modify Name and Address
list[1] = "Address";          // in view MyCustomer
sav.updateXML(doc);          // all entries having the
                             //CNo-value are changed

Hence, the effect of updateXML is shown in Listing 6.19.

Listing 6.19 Effect of updateXML
UPDATE MyCustomer
SET Name    = :name,    /* from the document */
    Address = :address  /* from the document */
WHERE CNo = :cno        /* from the document */

Further methods allow for bulk operations and determine the Commit behavior:

  • void setBatchSize(int n): Helps reduce client/server communication?n operations are collected and transmitted as a bulk operation to the database server.

  • void setCommitBatch(int n): Defines simple transactional behavior?a commit is performed after every n operations.

Again, tags can be renamed by using the set methods such as setRowTag. Furthermore, a style sheet can be registered with setXSLT; it is executed before the document is written to the database. This is another effective manner to affect the storage?that is, use XSLT to transfer an XML document into a suitable form that can then be handled by the canonical mapping.


Part IV: Applications of XML