The natural question is which of these many techniques is best? Well, that all depends. You must weigh the options with regard to the nature of your data. What works best in one case may not work at all in another. That said, here's what I found on the 7-Eleven data warehouse (shown in Table 8-1).
Fact Implementation | Timing |
---|---|
Non-Partitioned Table | 9,293 |
Range Partitioned Table | 4,747 |
Multi-Column Range Partitioned Table | 4,987 |
Range-Hash Partitioned Table | 6,319 |
Range-List Partitioned Table | 4,820 |
Non-Partitioned IOT[1] | 12,508 |
Range Partitioned IOT | 14,902 |
[1] IOT stands for index organized table. This is a table in Oracle where both the table and its index are created and stored together as a single data structure. This can provide quicker access for tables that are fully indexed (i.e. tables where the index contains a majority or large percentage of the available columns).
From these results, we see that simple partitioning gave the best results. But, let me reiterate that these results are specific to a particular data warehouse's data and the nature of the end-users' queries. You should perform similar benchmarks against your data to be absolutely sure. Remember that what often looks good on paper may well under-perform in reality. So don't go into this with any preconceived favorites or other prejudices. Let the chips fall where they will, and implement the choice that works best for your data.
When in doubt, or if you don't have the time to benchmark, just go with simple range-based partitioning along a time dimension. In most cases, range partitioning will be a safe and near optimal choice.