Dimensional Hierarchies

One of the hardest star schema design concepts to initially grasp is that of dimensional hierarchies. I've found more confusion in this one area alone than with all other star schema design issues combined. However, dimensional hierarchies are truly one of the easiest star schema design concepts to understand?once explained properly.

First, note I said the "concept" of dimensional hierarchies. Oracle 8i and 9i both support the CREATE DIMENSION statement for actually documenting the existence of a dimensional hierarchy. Note that this statement merely creates meta-data within the Oracle data dictionary to describe the dimensional hierarchy such that materialized view query rewrites work properly. There is no physical database object created by the CREATE DIMENSION statement. But we'll discuss this in a later chapter where we specifically address using materialized views. For now, we mean implementing a dimensional hierarchy manually using plain, old tables (i.e., without using Oracle's CREATE DIMENSION statement).

Look back at the dimension tables in Figure 4-1. Each has a third column of LEVELX. This is the dimensional hierarchy column. What does this column mean or do?

According to Oracle: "A dimension defines a parent?child relationship between pairs of column sets, where all the columns of a column set must come from the same table." That's the hard way (it seems) of saying that each dimension table represents multiple types of information (i.e., different record types) as represented by the columns that contain data for a given row.

Let's look at a specific example. The LEVELX column of the PERIOD dimension from Figure 4-1 has the following distinct values:

  • DAY

  • WEEK

  • MONTH

  • QUARTER

  • YEAR

Let's assume that our end-user is using a business intelligence tool and wants to use the PERIOD dimension to specify a query based on a quarter. Only certain columns from the PERIOD table apply?in fact, just the PERIOD_QUARTER column. All other columns should not have values. If the query focus was month, then only the columns PERIOD_QUARTER and PERIOD_MONTH should have values. In effect, we're saying that a column's mandatory versus optional property depends on the value of another column from the same row of data. This requires all non-primary and non-unique key columns for the dimension table to be optional. This in turn requires the DBA to write some pretty complex table check constraints.

For instance, for the PERIOD table from Figure 4-1, the table-level check constraint necessary is:

CONSTRAINT PERIOD_LEVELX
  CHECK ( ( LEVELX = 'YEAR' )
          OR
          ( LEVELX = 'QUARTER' AND
            PERIOD_QUARTER IS NOT NULL )
          OR
          ( LEVELX = 'MONTH' AND
            PERIOD_QUARTER IS NOT NULL AND
            PERIOD_MONTH IS NOT NULL )
          OR
          ( LEVELX = 'WEEK' AND
            PERIOD_QUARTER IS NOT NULL AND
            PERIOD_MONTH IS NOT NULL AND
            PERIOD_WEEK IS NOT NULL AND
            WEEK_NUMBER IS NOT NULL )
          OR
          ( LEVELX = 'DAY'  AND
            PERIOD_WEEK IS NOT NULL AND
            PERIOD_MONTH IS NOT NULL AND
            PERIOD_QUARTER IS NOT NULL AND
            DAY_NUMBER_OF_WK IS NOT NULL AND
            DAY_NUMBER_OF_MTH IS NOT NULL AND
            HOLIDAY_FLAG IS NOT NULL AND
            WEEKEND_FLAG IS NOT NULL AND
            WORKDAY_FLAG IS NOT NULL AND
            DAY_OF_WK IS NOT NULL AND
            WEEK_NUMBER IS NOT NULL )
        )

I have two final thoughts regarding dimensional hierarchies. First, most data modeling tools generally will not help you with writing complex table check constraints such as this example. Second, the check constraint should enforce whatever business rules the business intelligence software requires. Don't get caught up in an opinion on this one. Just build whatever is necessary.