The examples you have worked to this point have been from a dimensional database that uses a star or snowflake schema.
Designing dimensional databases is an art form and requires not only sound dimensional modeling knowledge, but also knowledge of the business processes with which you are dealing. Data warehousing has several design approaches. Regardless of which approach you take, having a good understanding of the approach's design techniques is critical to the success of a data warehouse project. Although Microsoft provides a powerful set of tools to implement data marts, astute execution of design methods is critical to getting the correct data?the truly business-significant business data?to the end users.
Analysis Services can connect to a relational database and allow you to build dimensions and define measures from it.
You would typically create something like a time dimension using a view as the dimension table and selecting the Time Dimension option. You would, of course, also select any other dimensions that you need that can be fulfilled by the relational table being accessed. After you have the dimensions, you can create a cube.
You just select another view (or the table) as the fact table and the appropriate numeric column(s) for the measures.
Process the cube as HOLAP. Select 80 percent for the Estimated Percentage of Optimization in the Aggregation option and process it. The cube should now be useable (can be browsed).
Even using a tool such as Analysis Services, you have limitations when dealing with a normalized database. Using a view can often solve (or mask) these issues. In some cases, however, more complicated facts and dimensions might require denormalized tables or a dimensional database in the storage component of the data warehouse to bring information together. Data cleansing and transformation are also major considerations before you attempt to present decision makers with data from OLTP systems.
Microsoft Analysis Services provides a great deal of flexibility for building scalable OLAP solutions, but how do you present the data to users? The answer lies in the other major component of Microsoft Decision Support Services: the Pivot Table Service. This client-side component delivers much of the functionality of Analysis Services using the same code base for the dimensional calculation engine, caching, and query processing. The Pivot Table Service manages client/server connections and is the layer for user interfaces to access Analysis Services cubes through the OLE DB for OLAP interface. ActiveX Data Objects Multidimensional (ADO MD) provides an application-level programming interface for development of OLAP applications. Third-party tools and future versions of Microsoft Excel and other Microsoft Office products will use the Pivot Table Service to access cubes.
The Pivot Table Service shares metadata with the Analysis Services, so a request for data on the client causes data and metadata to be downloaded to the client. The Pivot Table Service determines whether requests need to be sent to the server or can be satisfied at the client with downloaded data. If a user requests sales information for the first quarter of 1998 and then later decides to query that data for the first quarter of 1997 for comparison, only the request for 1997 data has to go to the server to get more data. The 1998 data is cached on the client.
Slices of data that are retrieved to the client computer can also be saved locally for analysis when the client computer is disconnected from the network. Users can download the data in which they are interested and analyze it offline. The Pivot Table Service can also create simple OLAP databases by accessing OLE-DB?compliant datasources.
With the ADO MD interface, developers will be able to access and manipulate objects in an Analysis Services database, enabling Web-based OLAP application development.
Many independent software vendors (ISVs) such as Brio are working with Microsoft to leverage the rich features of the OLAP Services. They offer robust user interfaces that can access Analysis Services and Pivot Table Service functionality. Versions of Microsoft Office will include the Pivot Table Service to enable built-in analysis in tools such as Excel.
The OLE DB for OLAP specification contains Multidimensional Expressions (MDX) syntax that is used to build datasets from cubes. Developers of OLE DB OLAP providers can map MDX syntax to SQL statements or native query languages of other OLAP servers, depending on storage techniques.
MDX statements build datasets by using information about cubes from which the data will be read. This includes the number of axes to include, the dimensions on each axis and the level of nesting, the members or member tuples and sort order of each dimension, and the dimension members used to filter, or slice, the data. Tuples are combinations of dimensions such as time and product time that present multidimensional data in a two-dimensional dataset.
An MDX statement has three basic parts:
Dimension, measure, and axis information in the SELECT clause
The source cube in the FROM clause
Dimension slicing in the WHERE clause
Expressions in an MDX statement operate on numbers, strings, members, tuples, and sets. Numbers and strings mean the same thing here as they do in other programming contexts. Members are the values in a dimension, and levels are groups of members. If the dimension were time, then a particular year, quarter, or month would be a member, and month values would belong to the month level. Use the dimension browser in the Analysis Manager to view members of a dimension. Sets are collections of tuple elements to further combine facts.
An example of an MDX statement and its results in the MDX sample application that installs with Microsoft DSS are shown in Figure 42.36. You can launch the sample MDX application from the Analysis Services program group (MDX Sample Application program item).
This is a simple MDX statement that shows the basic parts of a working query. In this case, measures are displayed in columns, and product and store dimension members make up the axes of this multidimensional query and are displayed in rows. The display of multiple dimensions in rows like this is how the term tuple is used in the context of Microsoft Analysis Services.
A number of other examples are included in the sample MDX application using the FoodMart2000 sample database.
Much more could be said about MDX syntax, and a complete discussion of MDX is probably worth its own chapter. The OLE DB for OLAP Programmers Reference is available on the Microsoft Web site at http://msdn.microsoft.com/library/default.asp?url=/libraryen-us/oledb/htm/olapintroduction_to_ole_db_for_olap.asp. It contains detailed information about MDX.
ActiveX Data Objects Multidimensional (ADO MD) is an easy-to-use access method for dimensional data via an OLE DB for OLAP provider. You can use ADO MD in Visual Basic, Visual C++, and Visual J++. Like ADO, ADO MD offers a rich application-development environment that can be used for multitier client/server and Web application development.
You can retrieve information about a cube, or metadata, and execute MDX statements using ADO MD to create "cellsets" to return interesting data to a user. ADO MD is another subject too broad to cover in detail in this chapter. Specifications for OLE DB for OLAP and ADO MD are available on the Microsoft Web site at www.microsoft.com.