Part I: Understanding Data Accuracy

Chapter 1: The Data Quality Problem
Chapter 2: Definition of Accurate Data
Chapter 3: Sources of Inaccurate Data

Data quality is gaining visibility daily as an important element in data management. More and more companies are discovering that data quality issues are costing them heavily in money, time, and missed opportunities. The cost of poor quality is usually hidden and not obvious to those not looking for it.

Data management technology has focused on the containers we put data in. We have made huge strides in developing robust database management software, transaction monitors, data replication services, security support, and backup and recovery services. These are all technologies that support the efficient gathering, storage, and protection of data. We have also created an extensive technology for accessing data. Data warehouse, data mart, data mining, and decision support technologies have all seen explosive growth, along with a great deal of sophistication.

With all of this, we have not done much about the actual data itself. Data quality technology has lagged behind these other areas. The robustness of the other technologies has brought about rapid growth in the amount of data we collect and the uses we put it to.

The failure to manage the content is now emerging as a major problem. Companies all over the globe are instituting data quality improvement programs. They are looking for education and tools to help them bring the content to the same level of robustness as the containers that hold it.

The first three chapters position data accuracy within the larger topic of data quality. They define the scope and severity of the data quality problems facing corporations and give a rigorous definition of data accuracy. The causes of data inaccuracies are classified and discussed to show the breadth of challenges any program must address if it is to have any chance of making a meaningful contribution.