The number of XML documents will grow rapidly in the future due to the increasing importance of XML as a data exchange format and as a language describing structured text. Serious thought is required on how to store XML documents while preserving their structure and allowing efficient access to parts of the structured documents. The latter calls for database techniques that primarily view XML documents as semi-structured data.
There are many standard database systems (relational, object-oriented, object-relational, as well as directory servers) and, more recently, the so-called native XML database systems. We would like to determine the suitability of these alternatives for storing XML documents. In this chapter we show the results of an intensive comparison of the time and space consumed by the different database systems when storing and extracting XML documents. In a suite of benchmark tests, we stored XML documents of different sizes using different kinds of database management systems, extracted complete documents, selected fragments of documents, and measured the performance and the disk space used. Our data models and object models for the standard database approaches also use the Document Object Model (DOM), which maps XML documents into trees and thus reduces the problem to storing and extracting trees (i.e., hierarchical structures). We used the pure standard database techniques without any extensions in order to demonstrate their true capabilities in solving this task.
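To make the reduction from XML documents to trees concrete, the following minimal sketch flattens a DOM tree into rows that any of the standard database systems could store, and rebuilds the document from those rows. It is an illustration only, not the schema used in the benchmarks: the row layout `(node_id, parent_id, position, kind, name, value)` and the function names `dom_to_rows` and `rows_to_xml` are assumptions introduced here, and only element, attribute, and text nodes are handled.

```python
from xml.dom import Node, minidom


def dom_to_rows(xml_text):
    """Flatten a DOM tree into rows, emitted in document order.

    Hypothetical row layout (an assumption for this sketch):
    (node_id, parent_id, position, kind, name, value)
    """
    doc = minidom.parseString(xml_text)
    rows = []

    def visit(node, parent_id, position):
        node_id = len(rows) + 1
        if node.nodeType == Node.ELEMENT_NODE:
            rows.append((node_id, parent_id, position, "element", node.tagName, None))
            # Attributes are stored as children of their owning element.
            attrs = node.attributes
            for i in range(attrs.length):
                attr = attrs.item(i)
                rows.append((len(rows) + 1, node_id, i, "attribute", attr.name, attr.value))
            for i, child in enumerate(node.childNodes):
                visit(child, node_id, i)
        elif node.nodeType == Node.TEXT_NODE:
            rows.append((node_id, parent_id, position, "text", None, node.data))
        # Other node types (comments, processing instructions) are skipped here.

    visit(doc.documentElement, None, 0)
    return rows


def rows_to_xml(rows):
    """Rebuild the XML text from rows; assumes rows are in document order."""
    doc = minidom.Document()
    elements = {}  # node_id -> rebuilt DOM element
    for node_id, parent_id, _position, kind, name, value in rows:
        if kind == "element":
            element = doc.createElement(name)
            parent = doc if parent_id is None else elements[parent_id]
            parent.appendChild(element)
            elements[node_id] = element
        elif kind == "attribute":
            elements[parent_id].setAttribute(name, value)
        elif kind == "text":
            elements[parent_id].appendChild(doc.createTextNode(value))
    return doc.documentElement.toxml()


xml_text = '<book id="1"><title>XML</title><author>Ann</author></book>'
rows = dom_to_rows(xml_text)
restored = rows_to_xml(rows)
```

Each row carries a reference to its parent, so storing a document amounts to inserting its rows, and extracting a complete document or a fragment amounts to retrieving a subtree and reassembling it in order. How efficiently a given system performs these two steps is what the time and space measurements compare.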