HTML dominates the Internet as a standard for presenting data and information in the twenty-first century. XML is replacing HTML in many domains as the next-generation standard for data representation and exchange. As e-commerce grows rapidly, the number of XML documents is growing as well. Hence, many XML-related applications and software products are available to satisfy market demands. In this chapter, we concentrate on an XML-enabled database and a native XML database by comparing their performance (for legal reasons the products cannot be named). The XML-enabled database is a relational database that transfers data between XML documents and relational tables. It retrieves data in a way that maintains the relational properties between tables and fields, rather than modeling XML documents. The native XML database stores XML data directly. It maps the structure of XML documents to the database without any conversion.
The XML-enabled database, built on a relational database engine, stores XML data in relations. The XML document schema must be translated into a relational schema before the corresponding tables can be accessed. Similarly, queries written in the XML query language must be translated into SQL to access the relations. Figure 20.1 shows the architecture of the XML-enabled database. XML-QL (XML Query Language) is implemented as an add-on function to the RDBMS. The relational QE (query engine) parses the XML-QL query and translates it into SQL. The SQL is then executed inside the RDBMS, which performs the operations through the storage manager. XML-QL support must be implemented separately. Since XML data instances are stored in relational tables, the translated SQL may involve retrieval and join operations, which may consume considerable CPU power and cause performance degradation.
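The translation steps described above can be illustrated with a minimal sketch. It is not the architecture of the product under test: it uses Python's standard sqlite3 and xml.etree.ElementTree modules, and the sample document, the table names (orders, items), and the helper functions shred and customers_who_ordered are all hypothetical. The sketch only shows how an XML hierarchy is mapped to relations and why a path-style query over that hierarchy ends up as a SQL join.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
ORDER_XML = """
<orders>
  <order id="1" customer="Alice">
    <item sku="A100" qty="2"/>
    <item sku="B200" qty="1"/>
  </order>
  <order id="2" customer="Bob">
    <item sku="A100" qty="5"/>
  </order>
</orders>
""".strip()


def shred(conn, xml_text):
    """Map the XML hierarchy onto two relations linked by a foreign key."""
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE items ("
        "order_id INTEGER REFERENCES orders(id), sku TEXT, qty INTEGER)"
    )
    root = ET.fromstring(xml_text)
    for order in root.findall("order"):
        order_id = int(order.get("id"))
        conn.execute(
            "INSERT INTO orders VALUES (?, ?)", (order_id, order.get("customer"))
        )
        for item in order.findall("item"):
            conn.execute(
                "INSERT INTO items VALUES (?, ?, ?)",
                (order_id, item.get("sku"), int(item.get("qty"))),
            )


def customers_who_ordered(conn, sku):
    """Hand-translated form of a path query such as
    /orders/order[item/@sku = <sku>]/@customer.
    The parent/child step in the path becomes a join between the two tables."""
    rows = conn.execute(
        "SELECT DISTINCT o.customer "
        "FROM orders AS o JOIN items AS i ON i.order_id = o.id "
        "WHERE i.sku = ? ORDER BY o.customer",
        (sku,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
shred(conn, ORDER_XML)
result = customers_who_ordered(conn, "A100")
print(result)
```

Each additional level of nesting in the document would add another table and another join to the translated query, which is where the CPU cost mentioned above comes from.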
The native XML database stores XML data directly. The database engine accesses the XML data without performing any conversion. This is the main difference between an XML-enabled database and a native XML database. Direct access in a native XML database can reduce processing time and provide better performance. The XML engine stores XML documents in, and retrieves them from, their respective data sources. Storage and retrieval are based on schemas defined by the administrator. The native XML database implements performance-enhancing technologies, such as compression and buffer pool management, and reduces the workload related to database administration. Figure 20.2 shows how XML data is stored and retrieved through the XML engine. The XML parser checks the syntactical correctness of the schema and ensures that incoming XML data objects are well formed. The object processor stores objects in the native XML store. The query language is XML Query Language (XQL). The query interpreter resolves incoming requests and interacts with the object composer to retrieve XML objects according to the schemas defined by the administrator. Using the storage and retrieval schemas, the object composer constructs the information objects and returns them as XML documents.
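The store-and-retrieve path described above can be sketched in a few lines, under strong simplifying assumptions. The class TinyXmlStore below is a hypothetical in-memory stand-in, not the engine of any real product: it has no storage schemas, compression, or buffer pool, and it uses the limited XPath subset of Python's xml.etree.ElementTree in place of XQL. It illustrates only two points from the text: documents are rejected at storage time if they are not well formed, and queries run against the XML structure itself, with no translation to SQL.

```python
import xml.etree.ElementTree as ET


class TinyXmlStore:
    """Hypothetical in-memory stand-in for a native XML store."""

    def __init__(self):
        self._docs = {}

    def store(self, name, xml_text):
        # Plays the role of the XML parser: ET.fromstring raises
        # ET.ParseError when the document is not well formed.
        self._docs[name] = ET.fromstring(xml_text)

    def query(self, path):
        """Evaluate a path expression against every stored document and
        return the matching elements serialized back to XML text."""
        hits = []
        for root in self._docs.values():
            for elem in root.findall(path):
                hits.append(ET.tostring(elem, encoding="unicode").strip())
        return hits


store = TinyXmlStore()
store.store(
    "orders",
    "<orders>"
    "<order id='1' customer='Alice'><item sku='A100' qty='2'/></order>"
    "<order id='2' customer='Bob'><item sku='B200' qty='1'/></order>"
    "</orders>",
)

# A document with a mismatched tag is refused before it reaches the store.
try:
    store.store("broken", "<orders><order></orders>")
    rejected = False
except ET.ParseError:
    rejected = True

# The path is evaluated directly on the stored tree; no join is needed
# to follow the order/item nesting.
matches = store.query("./order[@customer='Alice']/item")
print(rejected, matches)
```

Compared with the relational mapping, the nesting of item inside order is followed by walking the tree rather than by joining tables, which is the source of the performance advantage claimed for native storage.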
The arrival of native XML databases has been fairly rapid. However, the maturity of the technology remains open to question. How well do such products perform? Can they really replace traditional (relational) database products? The goal of this chapter is to try to answer these questions. We will implement a set of performance tests to compare a modern XML-enabled database with a native XML database. Through various procedures and operations, we will measure and record the results. We will then analyze the data and draw our conclusions.