Partition Option Benchmarks

The natural question is which of these many techniques is best? Well, that all depends. You must weigh the options with regard to the nature of your data. What works best in one case may not work at all in another. That said, here's what I found on the 7-Eleven data warehouse (shown in Table 8-1).

Table 8-1. Performance Charcteristics for Various Table Implementation Options
Fact Implementation	Timing
Non-Partitioned Table	9,293
Range Partitioned Table	4,747
Multi-Column Range Partitioned Table	4,987
Range-Hash Partitioned Table	6,319
Range-List Partitioned Table	4,820
Non-Partitioned IOT^[1]	12,508
Range Partitioned IOT	14,902

^[1] IOT stands for index organized table. This is a table in Oracle where both the table and its index are created and stored together as a single data structure. This can provide quicker access for tables that are fully indexed (i.e. tables where the index contains a majority or large percentage of the available columns).

From these results, we see that simple partitioning gave the best results. But, let me reiterate that these results are specific to a particular data warehouse's data and the nature of the end-users' queries. You should perform similar benchmarks against your data to be absolutely sure. Remember that what often looks good on paper may well under-perform in reality. So don't go into this with any preconceived favorites or other prejudices. Let the chips fall where they will, and implement the choice that works best for your data.

When in doubt, or if you don't have the time to benchmark, just go with simple range-based partitioning along a time dimension. In most cases, range partitioning will be a safe and near optimal choice.