Far too often, people have the expectation that using expensive hardware is the only way to obtain optimal performance from their data warehouse. They'll spend a lot of money to throw both hardware and software at their performance problems, including items such as:
Multi-CPU servers (symmetric multi-processing [SMP] or massively parallel processing [MPP])
RAID disk arrays (storage area network [SAN] or network-attached storage [NAS])
More disk array memory cache
Faster disk drives (e.g., 15,000 RPM)
More disks (i.e., switch RAID-5 to RAID-1+0)
 There are two common kinds of operating file systems: cooked and raw. With cooked file systems, the operating system manages access and operations on files and their contents. With raw file systems, the applications themselves do this work?bypassing the operating system file system.
Better file systems (e.g., Veritas with Quick IO option)
I've seen more money spent on hardware upgrades to solve performance problems in data warehousing than on any other item. One company with a data warehouse I visited actually switched both its UNIX server and disk array vendors in an attempt to solve its severe performance problems. Imagine their surprise when the problem did not go away with all that new hardware. Then imagine their utter surprise when it was fixable in a couple of hours merely by changing a few INIT.ORA parameters and redoing their table and index statistics collections!
In reality, the correct Oracle version, proper use of all its features, and the underlying database design are the most important factors for obtaining optimal performance for any successful data warehouse implementation. Of course, there are certain minimum hardware and software requirements that must be met. For example, I cannot imagine a multi-terabyte data warehouse on a PC. I also cannot envision a successful data warehouse on a mainframe?if it's using the wrong version of Oracle or fails to utilize Oracle's data warehousing-specific features.
The primary database feature requirements for a successful Oracle data warehouse are:
Reliable and efficient partitioning
Reliable and efficient bitmap indexes
Query explain plan support for star transformation access method
Reliable and efficient statistics for cost-based optimization
Reliable and efficient histograms for cost-based optimization
Reliable, efficient, and easy-to-use parallel query and data manipulation language (DML)
Let's see how the various Oracle versions measure up.
Oracle 7.X lacks all the key data warehousing feature requirements. You do not want to be on this version for any kind of serious data warehousing project. You will fail or have to upgrade once your data warehouse exceeds a few hundred GB. For example, a simple data warehouse query that ran over 13 hours under Oracle 7.3 ran in less than 10 minutes under Oracle 8.0, in less than 7 minutes under Oracle 8i, and in less than 5 minutes under Oracle 9i. Except for minor INIT.ORA changes, the only difference was the optimizer's chosen explain plan for the query.
Still not convinced? Let's examine the features people think exist in 7.X that make data warehouses a possibility:
Oracle 7.X's partitioning is really what's referred to as partition views. It's nothing more than a way to have a view definition tie together disjointed tables so as to give the appearance of partitioning. Partition views lack partition-based DML operations, partition-level query options, and partition-based indexing. Partition views are smoke and mirrors at best trying to resemble real partitioning. They don't cut it.
Oracle 7.X's bitmap indexes are totally unreliable. I logged so many TARs on bitmap indexes under both Oracle 7.X and 8.0 that I almost gave up on using them. Thank goodness 8i and 9i fixed these problems. If you like ORA-600 errors and wrong results, then by all means use bitmap indexes on large tables under Oracle 7.X.
 When you call Oracle technical support and log an issue or bug, you are given a TAR number to reference the occasion. TAR stands for technical assistance requests.
Oracle 7.X's STAR hint is also a joke. It does a Cartesian product of all the dimension tables and then joins that to the fact table. The thought was that doing one join was the way to go. And if I've got to actually convince you that Cartesian products are undesirable, then you're reading the wrong book.
Oracle 8.0 is the first Oracle version to meet many of the data warehousing feature requirements. But like new cars, the first model year or two are often worth avoiding. The partitioning is fairly sound, but the bitmap indexes remain problematic. Specifically, it seems that bitmap indexes on tables with over a few hundred million rows still raise a few ORA-600 errors and the occasional wrong result. If you must build a data warehouse under Oracle 8.0, then be advised that it will work best only for very small data warehouses.
Both Oracle 8i and 9i support all the data warehousing feature requirements. I've found both Oracle 8.1.7 and 9.0.1 to make data warehousing projects more likely to succeed?so much so that my advice is that you should only make an attempt at a data warehouse in these versions of Oracle, period. Now, many people might state that their ERP applications are still on Oracle 7.3 and their core business OLTP applications are primarily on Oracle 8.0?with a few smaller projects underway on either Oracle 8i or 9i. So what? The data warehouse is a new project and must have those features in the newer releases to succeed.
Here's another piece of advice that will sound hard to accept: Successful data warehouses rely so heavily on these new features that their DBAs tend to ride the bleeding edge of Oracle releases. For example, my 7-Eleven data warehouse was considered a huge success by any and all measures. Guess what? We were never more than 60 days out on any major upgrade or patch, ever. Yes, the rest of 7-Eleven was still on 7.3 and working on a phased plan to upgrade the ERP and OLTP systems over the following year to Oracle 8i. But, the data warehouse had already been on Oracle 8i (and its latest release) for over a year. In fact, we were already planning for Oracle 9i.
Another way to look at this is to review the market thrusts of both Oracle 8i and 9i. Each version, when released, included new key features primarily for two very hot market niches: the Web and data warehousing. The "Getting to Know Oracle 8i" document (Oracle Part #A68020-01) states that:
Oracle8i, the database for Internet computing, changes the way information is managed and accessed to meet the demands of the Internet age, while providing significant new features for traditional online transaction processing (OLTP) and data warehouse applications. It provides advanced tools to manage all types of data in Web sites, but it also delivers the performance, scalability, and availability needed to support very large database (VLDB) and mission-critical applications.
In the same document under data warehousing improvements, Oracle states:
In the Oracle8 Enterprise Edition, a new method for executing star queries has been introduced. Using a more efficient algorithm, and utilizing bitmapped indexes, the new star-query processing provides a significant performance boost to data warehouse applications.
Insert, update, and delete operations can now be run in parallel in the Oracle8 Enterprise Edition. These operations, known as parallel DML, are executed in parallel across multiple processes. By having these operations execute in parallel, the statement will be completed much more quickly than if the same statement were executed in a serial fashion. Parallel DML complements parallel query by providing parallel transaction execution as well as queries. Parallel DML is useful in a decision support (DSS) or data warehouse environment where bulk DML operations are common. However, parallel DML operations can also speed up batch jobs running in an OLTP database.
The Oracle8 Enterprise Edition can manage databases of hundreds of terabytes in size because of partitioning, administrative improvements, and internal enhancements. Many size limitations in earlier versions of Oracle have been raised, such as the number of columns per table, the maximum database size, and the number of files per database.
Likewise, "Oracle9i Database New Features" [Oracle Part #A90120-02] states:
Oracle9i broadens the footprint of the relational database in a data warehouse by becoming a scalable data engine for all operations on data warehousing data, and not just in loading and basic query operations. As such, it is the first true data warehouse platform. Oracle9i provides new server functionality in analytic capabilities, ETL (Extraction, Transformation, Loading), and data mining.
Moreover, "Oracle9i Database 9.2 New Features" [Oracle Part #A96531-01] states:
Oracle9i release 2 continues to challenge the competition by providing the best platform support for business intelligence in medium to large-scale enterprises. Oracle9i technology focuses especially on the challenges raised by the large volume of data and the need for near real time complex analysis in an Internet-enabled environment.
It should be clear that Oracle 8i and 9i are clearly targeted for the world of data warehousing.