As the world becomes ever more data oriented, a much larger emphasis is being placed on getting data from one place to another. To complicate matters, the data can be stored in many different formats, contexts, file systems, and locations. In addition, the data often requires significant transformation and conversion processing as it is being moved around. Whether you are trying to move data from Excel to SQL Server, create a data mart (or data warehouse), or distribute data to heterogeneous databases, you are essentially enabling someone with data.
This section will describe Microsoft's Data Transformation Services (DTS) environment and how it is addressing these needs. As mentioned earlier, the focus will be on importing, exporting, and transforming data from one or more datasources to one or more data targets.
Common requirements of DTS might include the following:
Exporting data out of SQL Server tables to other applications and environments (ODBC or OLE DB datasources, or via flat files)
Importing data into SQL Server tables from other applications and environments (ODBC or OLE DB datasources, or via flat files)
Initializing data in some data replication situations, such as initial snapshots
Aggregating data (data transformation) for distribution to/from data marts or data warehouses
Changing the data's context or format before importing or exporting it (data conversion)
Some typical business scenarios might include the following:
Enabling data marts to receive data from a master data warehouse through periodic updates (see Figure 20.6)
Populating a master data warehouse from legacy systems (see Figure 20.7)
Initializing heterogeneous replication subscriber tables on Oracle from a SQL Server 2000 Publisher (see Figure 20.8)
Pulling sales data directly into SQL Server 2000 from an Access or Excel application (see Figure 20.9)
Exporting static time-reporting data files (flat files) for distribution to your remote consultants
Importing new orders directly or indirectly from your sales force or distributed sales systems
You might be thinking at this point that DTS is what you might need for something you are trying to build. In general, DTS is what you will need if any of the following conditions exist:
You need to import data directly into SQL Server from one or more ODBC datasources, OLE DB datasources, or via flat files.
You need to export data directly out of SQL Server to one or more ODBC datasources, OLE DB datasources, or via flat files.
You need to perform data conversions, transformations, or aggregations on data from one or more datasources for distribution to one or more data targets. You also need to access the data directly via any ODBC datasource, OLE DB datasource, or via flat files.
Your bulk data movement doesn't have to be faster than the speed of light. Unfortunately, DTS must utilize conventional connection techniques to these datasources. It also must create intermediate buffers to hold data during the transformation steps. This usually disqualifies DTS on the high performance side of requirements (for large, bulk data movements with any type of data transformations defined at least). Alternative importing/exporting facilities like BCP offer better performance but lack the flexibility of DTS.