20.6 Experiment Result

Table 20.1 provides a summary of the tests we used in our performance evaluation. We discuss our results following the table.

20.6.1 Database Size

Database size is the disk space that the database server requires to store the data after the conversion procedures. Index size is also measured. The unit of measurement is megabytes (MB).

As shown in Figure 20.4, the native XML database needs more disk space than the XML-enabled database to store both data and indexes. Its growth is almost exponential, so the gap becomes more serious as the number of records increases. For 50,000 records, the index size of the native XML database is over 150 times that of the XML-enabled database. The XML-enabled database controls its size much better: from 100 to 10,000 records, its index size stays at approximately 0.11 MB. The larger index in the native XML database can be explained by its more comprehensive indexing support, such as full-text searching. As for the data itself, storing XML in the native XML database is not inherently less space-efficient than decomposing the same data and storing it in a relational database. The larger size arises only because the native XML database must store tag names, for both elements and attributes, in order to retain the native XML features of the document source.
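The cost of storing tag names can be illustrated with a small sketch. It is not the measurement procedure used in this chapter: the record, its field names, and the helper functions `to_xml` and `markup_overhead` are hypothetical, and the sketch only compares the serialized size of one record with and without its element tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical item record; the field names are illustrative only
# and are not taken from the chapter's schema.
record = {"item_id": "1001", "name": "Widget", "price": "9.99"}

def to_xml(rec):
    """Serialize one record as an <item> element with one child per field."""
    root = ET.Element("item")
    for field, value in rec.items():
        ET.SubElement(root, field).text = value
    return ET.tostring(root)  # bytes, default encoding

def markup_overhead(rec):
    """Return (xml_bytes, value_bytes): size with tags vs. size of the values alone."""
    xml_bytes = len(to_xml(rec))
    value_bytes = sum(len(v.encode("utf-8")) for v in rec.values())
    return xml_bytes, value_bytes

xml_bytes, value_bytes = markup_overhead(record)
print(f"with tags: {xml_bytes} bytes, values only: {value_bytes} bytes")
```

For short field values such as these, the tag names account for most of the serialized bytes, which is consistent with the explanation given above. A real database product may compress or tokenize tag names, so the ratio here is only indicative.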

Figure 20.4. Results of Database Size

20.6.2 SQL Operations (Single Record)

The objective of Q1 and Q2 is to measure insert performance (see Figure 20.5). Q1 is a single insert statement involving only one table (item). Q2 involves a master-details relationship (customer and customer_address). As the results indicate, the XML-enabled database performs better than the native XML database in all cases. However, both products show steady figures no matter how large the database is. We conclude that insert performance is not affected by the database size. Furthermore, Q2 takes more time than Q1 because Q2 needs to handle more than one table.
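The difference between the two insert tests can be sketched as follows. This is a minimal illustration on the relational side only, using Python's built-in sqlite3 module as a stand-in for the XML-enabled database; the column names and the helper functions `insert_item` and `insert_customer` are assumptions, not the statements used in the experiment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Assumed, simplified schema; only the table names come from the text.
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE customer_address (
        address_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        city TEXT NOT NULL
    );
""")

def insert_item(conn, item_id, name):
    """Q1-style insert: one statement, one table."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO item (item_id, name) VALUES (?, ?)", (item_id, name))

def insert_customer(conn, customer_id, name, cities):
    """Q2-style insert: one master row plus its detail rows in a single transaction."""
    with conn:
        conn.execute(
            "INSERT INTO customer (customer_id, name) VALUES (?, ?)",
            (customer_id, name),
        )
        conn.executemany(
            "INSERT INTO customer_address (customer_id, city) VALUES (?, ?)",
            [(customer_id, city) for city in cities],
        )

insert_item(conn, 1, "Widget")
insert_customer(conn, 1, "Alice", ["Hong Kong", "Kowloon"])
```

The Q2-style function issues several statements and must keep the master and detail rows consistent, which fits the observation that Q2 takes longer than Q1.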

Figure 20.5. Results of Q1 and Q2: Insert

The objective of Q3 and Q4 is to measure update performance (see Figure 20.6). We obtained results similar to those for the insert operation (shown in Figure 20.5). The XML-enabled database performs better than the native XML database. An update takes less time than an insert. This is reasonable: the insert operation needs to check data integrity and unique indexes before execution, whereas before an update or delete operation the record has already been retrieved from the database, which saves time. Both products show steady figures no matter how large the database is. We conclude that update performance is not affected by the database size.

Figure 20.6. Results of Q3 and Q4: Update

The objective of Q5 and Q6 is to measure delete performance (see Figure 20.7). We obtained results similar to those for Q1 through Q4. The XML-enabled database performs better than the native XML database. Since update and delete follow the same logic, there is not much performance difference between them. Both products show steady figures no matter how large the database is. We conclude that delete performance is not affected by the database size.

Figure 20.7. Results of Q5 and Q6: Delete

Q7, Q8, and Q9 measure the time to search for a record using an index key (see Figure 20.8). For 100 or 1,000 records, the results for both products are very similar. From 5,000 records onward, the native XML database outperforms the XML-enabled database. The native XML database provides steady performance in all cases. We conclude that its native storage strategy and indexing approach are efficient enough for searching in a database.

Figure 20.8. Results for Q7, Q8, and Q9: Searching

For this section, we conclude that the XML-enabled database outperforms the native XML database in single SQL operations, but the native XML database outperforms the XML-enabled database in index searching. It seems that the XML parser in either database has no impact on our performance results.

20.6.3 SQL Operations (Mass Records)

Q10 and Q11 are the bulk load operations for Item and Customer records (see Figure 20.9). As the figures indicate, the native XML database performs better than the XML-enabled database as data size becomes larger. The XML-enabled database runs faster than the native XML database for 100 and 1,000 records. For larger record numbers, the native XML database takes at most half the running time of the XML-enabled database. This may be due to the native XML database's storage strategy. The API gateways of the XML-enabled database could be the bottleneck for larger data sizes. Furthermore, the running time for Customer records exceeds that for Item records as size becomes larger.

Figure 20.9. Results for Q10 and Q11: Bulk Load

Q12 and Q13 are the mass delete operations for Item and Customer records (see Figure 20.10). As the figures indicate, the XML-enabled database performs better than the native XML database except for 50,000 records. For the XML-enabled database, a single SQL statement that is structurally simple yet powerful can perform a mass delete. In contrast, the servlet program for the native XML database must first execute an additional query to retrieve all candidate Customers/Items. The program then uses that temporary list to remove the records.
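The two deletion strategies described above can be contrasted in a short sketch. It is an illustration under stated assumptions, not the code of the sample application: sqlite3 stands in for the XML-enabled database, a plain Python dictionary stands in for the native XML document store, and the `price` criterion and both function names are hypothetical.

```python
import sqlite3

def mass_delete_sql(conn, max_price):
    """Set-based approach: one statement removes every matching row."""
    with conn:
        cur = conn.execute("DELETE FROM item WHERE price < ?", (max_price,))
    return cur.rowcount

def mass_delete_two_step(store, max_price):
    """Two-step approach: query for a temporary key list, then delete one by one."""
    doomed = [key for key, doc in store.items() if doc["price"] < max_price]  # extra query
    for key in doomed:
        del store[key]  # one delete request per document
    return len(doomed)

# Ten hypothetical items priced 1.0 through 10.0.
rows = [(i, float(i)) for i in range(1, 11)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (item_id INTEGER PRIMARY KEY, price REAL)")
conn.executemany("INSERT INTO item VALUES (?, ?)", rows)
conn.commit()

# Dictionary used as a stand-in for a document store keyed by item id.
store = {item_id: {"price": price} for item_id, price in rows}

deleted_sql = mass_delete_sql(conn, 5.0)
deleted_two_step = mass_delete_two_step(store, 5.0)
```

Both functions remove the same records, but the second needs one round trip for the query plus one per deleted document, which is one plausible source of the extra running time.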

Figure 20.10. Results for Q12 and Q13: Mass Delete

Q14 and Q15 are the mass update operations for Item and Customer records (see Figure 20.11). As the figures indicate, the XML-enabled database performs better than the native XML database except in the case of 50,000 records. We attribute this to the same reasons outlined above for the mass delete results in Figure 20.10.

Figure 20.11. Results for Q14 and Q15: Mass Update

20.6.4 Reporting

The following are the results from the reporting section of the sample application. Similar results are measured for Q16 and Q17 (see Figure 20.12). The native XML database has steady performance in implementing regular expressions. The XML-enabled database results are more variable.

Figure 20.12. Results for Q16 and Q17: Reporting

Q18 measures the join (a one-to-many relationship) between the records/documents in invoice and invoice_item. The results in Figure 20.13 are similar to Q5. For rich-content documents or records, the native XML database outperforms the XML-enabled database, and the difference is clear from 10,000 records onward. In this case, the native XML database does not give a steady result: its running time goes up as data size increases, although less severely than that of the XML-enabled database.

Q19 combines selection criteria and sorting. The result is similar to Q4. This time the XML-enabled database provides steady performance within the testing range. It outperforms the native XML database up to 10,000 records, and even at 50,000 records the two products perform quite similarly.
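As a rough illustration of the two query shapes, the following sketch shows a one-to-many join and a selection with sorting on the relational side. sqlite3 again stands in for the XML-enabled database; the columns, the sample rows, and the functions `invoice_lines` and `items_over` are assumptions, since the actual Q18 and Q19 statements are not reproduced in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema and sample rows; only the table names come from the text.
conn.executescript("""
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE invoice_item (
        invoice_id INTEGER REFERENCES invoice(invoice_id),
        item TEXT,
        amount REAL
    );
    INSERT INTO invoice VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO invoice_item VALUES
        (1, 'Widget', 10.0), (1, 'Gadget', 5.0), (2, 'Widget', 7.5);
""")

def invoice_lines(conn):
    """Q18-style join: each invoice paired with its many invoice_item rows."""
    return conn.execute("""
        SELECT i.invoice_id, i.customer, it.item, it.amount
        FROM invoice AS i
        JOIN invoice_item AS it ON it.invoice_id = i.invoice_id
        ORDER BY i.invoice_id, it.item
    """).fetchall()

def items_over(conn, min_amount):
    """Q19-style query: a selection criterion combined with sorting."""
    return conn.execute(
        "SELECT item, amount FROM invoice_item WHERE amount >= ? ORDER BY amount DESC",
        (min_amount,),
    ).fetchall()

lines = invoice_lines(conn)
expensive = items_over(conn, 7.0)
```

In a native XML database the invoice items would typically be nested inside the invoice document, so the join in `invoice_lines` has no direct counterpart there; this may be one reason the two products behave differently on Q18.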

Figure 20.13. Results for Q18 and Q19

We conclude that the native XML database has better query optimization than the XML-enabled database for large data sizes. However, the XML-enabled database does dominate for small data sizes.

