Warehouses Evolve without Phases

The typical database application development life cycle is something like the following:

  • Deliver a version.

  • Begin work on the next version.

  • Perform maintenance on the current version.

  • Promote changes or deltas to the current version.

  • Incorporate changes or deltas into the new version.

  • Repeat the process.

For EDS and 7-Eleven, we had to have a customer signature and scheduled downtime to promote a database application change for any OLTP system. This practice makes good business sense. When OLTP systems run the customer's business, you don't want to make unapproved or unscheduled changes that could result in customer OLTP application downtime, because such downtime could cost the customer real money.

In data warehousing, things are very different. There really is no database application as the database itself is the object of desire from the customer's viewpoint. The data warehouse may be queried by end-user tools and have batch programs for loading, but the database itself is really the heart and soul of the data warehouse. Customers see its information at their disposal as the real deliverable. Or, as I sometimes like to say, "It's the database, Stupid."

As users mine the data warehouse to answer new and more involved business questions, they quite often and regularly find something lacking. The most common requests are often to add a new column to a table or create a new summarization or aggregate table that does not exist. The first solves a missing data problem and the second reduces report runtimes. In addition, users often ask for columns to be displayed differently or contain additional data. The point is that change requests come in daily, from mid-level managers to true executives.

So, the data warehousing application development lifecycle looks more like:

  • Deliver the first version.

  • Promote changes or deltas to the current version.

  • Repeat the process.

This evolutionary method actually requires a much more cautious approach to promoting changes. The batch load programmers, the DBA, and the project manager must all be 100% in sync with each other at all times because there is no real version control of the code or database data definition language (DDL) to fall back on. Data warehouse changes occur with too much frequency and urgency to follow a strict development methodology. From the OLTP perspective, the data warehouse team appears to fly by the seat of their pants. So, a great project manager, a detail-oriented project lead, and a very experienced DBA are needed to make this process work.