Chapter 8. Partitioning for Manageability

Back in Chapter 3, two critical points were made: Fact tables are so large that your success will largely hinge on their implementation, and fact tables should be partitioned to improve their manageability, not for quicker end-user ad-hoc queries or faster data load times.

I find that most DBAs eagerly partition their facts since it seems intuitively obvious that anything that large should be partitioned. However, they often partition their facts for the wrong reasons, and sometimes using the wrong, or even the worst, partitioning criteria. About half do so to improve query response times. The belief is that the best queries are those done in parallel against partitioned and sub-partitioned tables. But the fact is (as explained back in Chapter 5) that obtaining the star transformation explain plan is the most critical aspect for ad-hoc queries. In fact, it is so important that it does not really matter whether the table is partitioned or not. Yes, queries will run faster against partitioned fact tables, but 98% of the query speed will be from achieving the correct star transformation explain plan. Partitioning will simply be "icing on the cake" in terms of speeding up the end-users' ad-hoc queries.

Therefore, DBAs typically partition for the wrong reasons, and possibly sub-optimally as well. For example, consider the DBA who partitions a fact along a non-time dimension as many queries reference that criteria. In this case, the DBA is actually partitioning along a candidate for aggregation in the mistaken belief that it will speed up queries. Don't confuse partitioning and aggregation; they serve very different purposes.

Other DBAs partition their facts to improve data loading batch cycles. But remember, adding rows to a table really is not dependent on the size of that table. You can add a million rows to a billion-row table just about as fast as to an empty one, especially using the APPEND hint to achieve direct mode loads. The real time savings here is that you can drop and recreate indexes on the affected partitions rather than for the entire table. This is really a divide and conquer technique for managing very long-running and costly operations. In other words, these DBAs are actually and unknowingly doing the right thing for the wrong reasons. But typically, they do get the partitioning criteria correct.

One last issue that often muddies the water regarding fact table partitioning is this: DBAs may define their partitioning schema under Oracle 8i (with its numerous limitations), and then not revisit that design when upgrading to Oracle 9i. For example, Oracle 8i does not offer list partitioning and only permits parallel operations across partitions, not within them. However, Oracle 9i offers list partitioning and permits parallel operations both across and within partitions. The point is that when either selecting an Oracle version or upgrading, you should re-evaluate the current partitioning design so as to best leverage what's now available. Otherwise, you may well end up with a partitioning scheme that is sub-optimal for your Oracle version.