19.5 Test Results


The benchmarking tests were run on an Intel Pentium III server at 450MHz with 256MB RAM and two mirrored UW-SCSI hard disks. The clients were Intel Pentium III machines at 350MHz with 128MB RAM. They were connected to the server by a 10-Mbit local area network. The storing strategies used a well-known relational DBMS, object-oriented DBMS, directory server, and native XML DBMS (for legal reasons the products cannot be named).

The documents used were automatically generated based on a DTD that defines the structure of project descriptions. The DTD contains 26 elements with a maximum nesting size of 8 and 4 attributes. XML documents based on this DTD contain information on the project members, such as names and addresses, as well as information on the publications, such as articles. This allows us to easily produce different-sized documents.

To compare the different database types, we ran a set of tests that do the following:

  • Store XML documents

  • Extract complete XML documents

  • Delete complete XML documents

  • Extract parts of documents identified by the position of elements in the document

  • Replace parts of documents

The benchmarks were specified and run by Fellhauer (see Fellhauer 2001 for the complete list of test results).

19.5.1 Evaluation of Performance

We ran the benchmarks five times with every document of every size, dropped the best and the worst results, and took the average of the remaining three values as the result value. Table 19.7 shows the results of storing XML documents.
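The aggregation rule described above (five runs, drop the best and the worst, average the remaining three) can be sketched in a few lines. This is only an illustration of the arithmetic; the function name `trimmed_result` and the sample timings are hypothetical and are not taken from the benchmark.

```python
def trimmed_result(timings):
    """Average five timings after dropping the best and the worst run."""
    if len(timings) != 5:
        raise ValueError("expected exactly five timings")
    ordered = sorted(timings)
    middle = ordered[1:-1]  # discard the fastest and the slowest run
    return sum(middle) / len(middle)


# Hypothetical timings in seconds, for illustration only.
sample_runs = [12.4, 11.9, 13.8, 12.1, 12.0]
result = trimmed_result(sample_runs)
print(f"result value: {result:.2f} s")
```

Dropping the extremes keeps a single outlier, such as a run disturbed by swapping or network load, from dominating the reported value.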

All figures in Table 19.7 measure the time for inserting the XML document into the database, including the time consumed by the DOM parser. The object-oriented database performs best. The native XML database shows a relatively high growth rate, which we could not confirm for larger documents because of the limited space of our test installation: an 11MB document took 35MB of the 50MB of disk space available in the test version. The poor result of storing the 25MB document in the typed relational database is surprising: it took more than 12 hours. We would like to investigate this further to determine if the number of tables involved is the reason. We were not able to store the 64MB document in the databases because of the 256MB main memory. The DOM tree could be built, but the traversal of the tree resulted in permanent swapping. Table 19.8 shows the results of extracting complete XML documents.

[*] Failed because of license restrictions.

The native XML database shows the best results, better than the directory server, which is known for its fast read access. It is not surprising that the relational databases consume a lot of time for reconstructing the documents. We do not know whether the size of the unique element table in the nontyped relational database will produce a result worse than the typed relational database result. The number of tables could influence the results for the 8MB document, especially when we apply proprietary statements for selecting hierarchically connected entries.

To extract parts of the XML documents, we ran queries that determine elements in the XML document and return these elements or parts of their content. All databases show similar results independent of the size of the document; there is no sequential search. The relational databases are the fastest, where the difference between the nontyped and the typed approaches is due to the runtime of consulting the DTD. Table 19.9 shows the test results for the query "Select the first heading of the third publication of a project determined by a project number."
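To make the positional query concrete, the following sketch runs it against a small in-memory document with Python's `xml.etree.ElementTree`. It is not the mechanism used by any of the benchmarked databases. The element and attribute names (`projects`, `project`, `number`, `publication`, `heading`) are assumptions made for illustration, since the DTD is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the real DTD's element names may differ.
SAMPLE = """
<projects>
  <project number="P1">
    <publication><heading>One A</heading><heading>One B</heading></publication>
    <publication><heading>Two A</heading></publication>
    <publication><heading>Three A</heading><heading>Three B</heading></publication>
  </project>
  <project number="P2">
    <publication><heading>Other</heading></publication>
  </project>
</projects>
""".strip()


def first_heading_of_third_publication(root, project_number):
    """Return the text of the first heading of the third publication, or None."""
    # Positions are 1-based, as in XPath.
    path = f"./project[@number='{project_number}']/publication[3]/heading[1]"
    node = root.find(path)
    return None if node is None else node.text


root = ET.fromstring(SAMPLE)
print(first_heading_of_third_publication(root, "P1"))
print(first_heading_of_third_publication(root, "P2"))
```

The second call returns `None` because project `P2` has only one publication in this sample.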

The nontyped relational database shows surprisingly good results especially for the selection of small document parts. It leaves the directory servers, which are known for fast read accesses, far behind. The poor results of the object database are caused by the reconstruction and searching of the DOM tree.

Table 19.8. Test Results for Extracting Complete XML Documents

Size of XML   Directory   Non-Typed             Typed                 Object-Oriented   Native XML
Documents     Server      Relational Database   Relational Database   Database          Database
125 KB        11.8 s      26.9 s                31.0 s                12.9 s            8.7 s
500 KB        22.2 s      81.1 s                75.9 s                26.5 s            9.5 s
2,000 KB      39.1 s      307.4 s               275.6 s               49.9 s            12.7 s
8,000 KB      153.4 s     2,369.7 s             1,620.3 s             175.2 s           28.7 s
16,000 KB     206.7 s     -                     -                     232.2 s           -
32,000 KB     413.4 s     -                     -                     904.2 s           -

Table 19.9. Test Results for Extracting Parts of XML Documents

Size of XML   Directory   Non-Typed             Typed                 Object-Oriented   Native XML
Documents     Server      Relational Database   Relational Database   Database          Database
125 KB        3.7 s       0.2 s                 3.4 s                 12.9 s            8.9 s
500 KB        3.6 s       0.2 s                 3.6 s                 24.9 s            9.5 s
2,000 KB      3.7 s       0.2 s                 3.5 s                 45.4 s            12.4 s
8,000 KB      3.8 s       0.2 s                 3.6 s                 154.4 s           235.1 s
16,000 KB     3.6 s       0.2 s                 3.5 s                 199.7 s           -
32,000 KB     3.6 s       -                     -                     396.7 s           -

Finally, Table 19.10 shows the results of updating parts of an XML document. For example, the whole person element determined by the last name and the project number should be replaced by a new person element given as an XML document.
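As an illustration of this kind of update, the sketch below replaces a person element in an in-memory tree using Python's `xml.etree.ElementTree`. It only shows the operation at the document level and says nothing about how the tested systems implement it. The element names (`project`, `person`, `firstname`, `lastname`), the `number` attribute, and the function `replace_person` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the real DTD's element names may differ.
SAMPLE = """
<projects>
  <project number="P1">
    <person><firstname>Ada</firstname><lastname>Miller</lastname></person>
    <person><firstname>Ben</firstname><lastname>Smith</lastname></person>
  </project>
</projects>
""".strip()


def replace_person(root, project_number, last_name, new_person_xml):
    """Replace the matching person element; return True if one was replaced."""
    project = root.find(f"./project[@number='{project_number}']")
    if project is None:
        return False
    for index, child in enumerate(project):
        if child.tag == "person" and child.findtext("lastname") == last_name:
            new_person = ET.fromstring(new_person_xml)
            new_person.tail = child.tail  # keep the surrounding whitespace
            project[index] = new_person   # swap the subtree in place
            return True
    return False


root = ET.fromstring(SAMPLE)
replaced = replace_person(
    root, "P1", "Smith",
    "<person><firstname>Bea</firstname><lastname>Smith</lastname></person>",
)
first_names = [p.findtext("firstname") for p in root.iter("person")]
print(replaced, first_names)
```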

The native XML database shows dramatically decreasing performance. The poor results of the object database are due to the bad performance of the search functions applied to the persistent DOM tree.

Table 19.10. Test Results for Replacing Parts of an XML Document

Size of XML   Directory   Non-Typed             Typed                 Object-Oriented   Native XML
Documents     Server      Relational Database   Relational Database   Database          Database
125 KB        17.8 s      7.3 s                 5.9 s                 809.0 s           19.6 s
500 KB        17.3 s      7.2 s                 5.8 s                 798.9 s           52.0 s
2,000 KB      17.2 s      7.3 s                 5.8 s                 798.7 s           195.9 s
8,000 KB      17.2 s      7.2 s                 5.8 s                 794.4 s           198.9 s
16,000 KB     17.1 s      7.4 s                 5.6 s                 796.7 s           692.3 s
32,000 KB     16.9 s      -                     -                     795.8 s           -

19.5.2 Evaluation of Space

Table 19.11 shows the disk space each database management system uses to store the schema and the XML documents of different sizes. The typed relational database approach defines a table for each element type, which increases space for the schema. However, disk space to store the XML documents is very efficient.

The directory server and the native XML database produce considerable overhead compared to the original XML document.
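The overhead can be expressed as the ratio of disk space used to the size of the original document. The sketch below computes that ratio for the 8,000 KB row of Table 19.11. The figures are taken from the table as they stand; whether they include the schema space is not stated, so the ratios should be read as rough indicators only.

```python
# Disk space in KB for the 8,000 KB document, copied from Table 19.11.
ORIGINAL_KB = 8000
STORED_KB = {
    "Directory Server": 40848,
    "Non-Typed Relational Database": 15157,
    "Typed Relational Database": 10547,
    "Object-Oriented Database": 13722,
    "Native XML Database": 28780,
}

# Ratio of stored size to original document size (1.0 means no overhead).
ratios = {name: stored / ORIGINAL_KB for name, stored in STORED_KB.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:32s} {ratio:.2f}x")
```

For this row the typed relational database needs about 1.3 times the original size, while the directory server needs about 5.1 times.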

19.5.3 Conclusion

The benchmarks have shown that the nontyped relational database approach has advantages over all other solutions. The weak point is the reconstruction of complete XML documents, which should be improved. As long as a standardized XML query language does not support inserting and updating functionality, the reconstruction of XML documents will be an important operation.

We do not know whether the poor results of the search function of the object-oriented database system are representative of this database type. They do, however, support our belief that searching large object trees causes long loading times because of the techniques these systems apply. To avoid this, special indexing techniques such as B-trees have to be applied. Although content management systems based on object-oriented databases implement this improvement, we used the bare object-oriented database approach to show its capabilities.

Table 19.11. Disk Space Usage in Kilobytes

Size of XML   Directory   Non-Typed             Typed                 Object-Oriented   Native XML
Documents     Server      Relational Database   Relational Database   Database          Database
Schema/0 KB   440         672                   3,484                 512               5,178
125 KB        2,064       117                   118                   615               760
500 KB        4,952       820                   546                   1,536             2,088
2,000 KB      11,240      3,164                 2,182                 3,738             7,060
8,000 KB      40,848      15,157                10,547                13,722            28,780
16,000 KB     53,280      26,562                18,476                22,630            > 50,000[*]
32,000 KB     120,024     -                     -                     43,991            -
64,000 KB     -           -                     -                     -                 -

[*] Space restriction of 50 MB.

The directory server shows disappointing results compared to the relational database. The expected very fast reading times could not be achieved. The mapping of the DOM tree into the directory information tree might also leave room for improvement. The space consumed by the directory server is critical as well. Additional experiments will be necessary to determine the reasons.

