Querying Star Schemas

End-users will generally utilize business intelligence tools to submit report requests. In Oracle terms, each request may generate a series of SELECT statements necessary to construct the complete report. For example, your users' business intelligence tool may have a server process of its own that processes the returned SQL SELECT results using advanced OLAP functions and logic not yet found within Oracle. So as the DBA, you'll only be able to see the intermediate queries by scanning the SGA. Not to worry, though, for generally if we can make the intermediate queries faster, the overall report execution will improve as well.

A typical user query might be something like the following: How much beer and coffee did we sell in our Dallas stores during December 1998? Using our example dimensional model in Figure 4-1, a typical ad-hoc business intelligence intermediate query submitted to Oracle might look something like:

SELECT prod.category_name,
       sum (fact.sales_unit) Units,
       sum (fact.sales_retail) Retail
FROM   pos_day              fact,
       period               per,
       location             loc,
       product              prod
WHERE  fact.period_id     = per.period_id
  AND  fact.location_id   = loc.location_id
  AND  fact.product_id    = prod.product_id
  AND  per.levelx         = 'DAY'
  AND  per.period_month   = 12
  AND  per.period_year    = 1998
  AND  loc.levelx         = 'STORE'
  AND  loc.city           = 'DALLAS'
  AND  loc.state          = 'TX'
  AND  prod.levelx        = 'ITEM'
  AND  prod.category_name in ('BEER','COFFEE')
GROUP BY prod.category_name;

with results of:

CATEGORY_NAME                 UNITS         RETAIL
---------------------- ------------ --------------
BEER                         11,613      64,490.81
COFFEE                       22,808      20,462.92

Even this example yields key insights into what the DBA can expect in the way of queries against star schema warehouses. In general, a star schema SELECT will:

  • Use GROUP functions and therefore GROUP BY.

  • Contain a JOIN of a fact with one or more dimensions.

  • Possess lots of WHERE restrictions using dimension columns.

  • Scan lots of rows to return relatively few rows of results.

Conversely, business intelligence tool-generated SQL may target the wrong level of table summarization (i.e., example query should have used the month table instead of the day table, and therefore offers opportunities for tuning or utilizing query rewrites).

That's it. Star schema queries are just that simple. In fact, they are nothing more than table searches via some lookup tables. Oracle has been able to handle queries of this nature for years, but the sheer size of data warehouses make this something altogether different.