With the increasing popularity of XML as a representation format for a wide variety of data, it is clear that large repositories of XML data sets will soon emerge. The effective management of XML in a database thus becomes a pressing issue. Several methods for managing XML databases have emerged, ranging from retrofitting commercial RDBMSs to building native XML database systems. There has naturally been an interest in benchmarking the performance of these systems, and a number of benchmarks have been proposed (Böhme and Rahm 2001; Bressan and Dobbie 2001; Schmidt et al. 2001). The focus of currently proposed benchmarks is to assess the performance of a given XML database in performing a variety of representative tasks. Such benchmarks are valuable to potential users of a database system in providing an indication of the performance that they can expect on their specific applications. The challenge is to devise benchmarks that are sufficiently representative of the requirements of most users. The TPC series of benchmarks accomplished this, with reasonable success, for relational database systems. However, no benchmark has been successful in the realm of ORDBMSs and OODBMSs, whose extensibility and user-defined functions lead to great heterogeneity in the nature of their use. It is too soon to say whether any of the current XML benchmarks will be successful in this respect; we certainly hope that they will.
One aspect that current XML benchmarks do not focus on is the performance of the basic query evaluation operations, such as selections, joins, and aggregations. A micro-benchmark that highlights the performance of these basic operations can be very helpful to a database developer in understanding and evaluating alternatives for implementing them. A number of questions related to performance may need to be answered: What are the strengths and weaknesses of specific access methods? Which areas should the developer focus attention on? What is the basis for choosing between two alternative implementations? Questions of this nature are central to well-engineered systems. Application-level benchmarks, by their nature, are unable to deal with these important issues in detail. For relational systems, the Wisconsin benchmark (DeWitt 1993) provided the database community with an invaluable engineering tool to assess the performance of individual operators and access methods. Inspired by the simplicity and effectiveness of the Wisconsin benchmark for measuring and understanding the performance of relational DBMSs, we develop a comparable benchmarking tool for XML data management systems. The benchmark that we propose is called the Michigan benchmark.
A challenging issue in designing any benchmark is the choice of the data set that the benchmark uses. If the data are specified to represent a particular "real application," they are likely to be quite uncharacteristic of other applications with different data distributions. Thus, holistic benchmarks can succeed only if they are able to find a real application with data characteristics that are reasonably representative of a large class of different applications.
For a micro-benchmark, the benchmark data set must be complex enough to incorporate data characteristics that are likely to have an impact on the performance of query operations. At the same time, the data set must be simple enough that queries against it are easy to pose and understand, and that each query directs the benchmark user to the precise component of the system that is performing poorly. We attempt to achieve this balance by using a data set that has a simple schema. In addition, random number generators are used sparingly in generating the benchmark's data set. The Michigan benchmark uses random generators for only two attribute values and derives all other data parameters from these two generated values. In addition, as in the Wisconsin benchmark, we use attribute names that reflect the domain and distribution of the attribute values.
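The flavor of this approach can be sketched as follows. The attribute names and derivation rules below are purely illustrative placeholders, not the benchmark's actual schema; the point is that only two values per node are drawn at random, and every other attribute is computed deterministically from them, with names that announce their domain.

```python
import random

def generate_node(node_id):
    """Illustrative sketch: two random values, everything else derived."""
    # The only two randomly generated attribute values.
    a_unique1 = random.randint(0, 2**31 - 1)
    a_unique2 = random.randint(0, 2**31 - 1)
    # All remaining attributes are derived from the two generated values.
    # Names reflect the attribute's domain, in the Wisconsin-benchmark style.
    return {
        "id": node_id,
        "aUnique1": a_unique1,
        "aUnique2": a_unique2,
        "aFour": a_unique2 % 4,                       # domain 0..3
        "aSixteen": (a_unique1 + a_unique2) % 16,     # domain 0..15
    }
```

Because the derived attributes are pure functions of the two random values, a benchmark user can reason exactly about selectivities (e.g., a predicate on `aFour` selects one quarter of the nodes) without tracing additional sources of randomness.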
When designing benchmark data sets for relational systems, the primary data characteristics of interest are the distribution and domain of the attribute values and the cardinality of the relations. There are also a few secondary characteristics, such as clustering and tuple/attribute size. In XML databases, besides the distribution and domain of attribute values and cardinality, several other characteristics, such as tree fanout and tree depth, are related to the structure of XML documents and contribute to the rich structure of XML data. An XML benchmark must incorporate these additional features into the design of its data and query sets. The Michigan benchmark achieves this by using a data set that incorporates these characteristics without introducing unnecessary complexity into the data set generation, and by carefully designing the benchmark queries to test the impact of these characteristics on individual query operations.
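To make the role of these structural parameters concrete, the following sketch builds a uniform XML tree whose depth and fanout are direct inputs to the generator. The element name `eNest` and attribute `aLevel` are illustrative placeholders, not the benchmark's actual schema; the point is that a generator controlling these parameters lets a query's cost be related directly to the tree shape.

```python
import xml.etree.ElementTree as ET

def build_tree(depth, fanout):
    """Build a uniform tree: every non-leaf node has `fanout` children,
    and leaves sit at level `depth`. Names are illustrative placeholders."""
    def make(level):
        node = ET.Element("eNest", attrib={"aLevel": str(level)})
        if level < depth:
            for _ in range(fanout):
                node.append(make(level + 1))
        return node
    return make(1)
```

A uniform tree of depth d and fanout f contains (f^d - 1)/(f - 1) nodes, so a benchmark user can predict, for instance, how many nodes an ancestor-descendant traversal must visit at each level.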
The remainder of this chapter is organized as follows. The next section presents related work. Following this, we discuss the rationale of the benchmark data set design. We then describe the benchmark queries. Next, we present our recommendations on how to analyze and present the results of the benchmark. Finally, we summarize the contribution of this benchmark.