Inaccurate data gets into databases at a number of points and for a variety of reasons. Any program to improve data accuracy must address the issues across the entire spectrum of opportunities for error.
Data can be entered mistakenly, can be deliberately entered inaccurately, can be the result of system errors, can decay in accuracy, can be turned into inaccurate data through moving and restructuring, and can be turned into wrong information when inappropriately reported or used. Understanding all of these areas will make data quality assurance professionals more expert in analyzing data and processes for inaccuracies.
The common theme throughout this chapter is that knowledge about your data is the key to assessing, moving, and using it successfully. There is no substitute for a sound knowledge base of information about the data. Most metadata repositories fall short of this need. If a process similar to that shown in Figure 3.4 is used vigorously, with the metadata repository updated at every stage, those who make decisions from the data will be working with more accurate data.
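To make the idea of a knowledge base about the data concrete, the following is a minimal sketch, not a description of any particular repository product. It assumes a repository can be modeled as a set of per-column rules; the names `ColumnRule`, `REPOSITORY`, and `find_violations`, along with the sample columns and records, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ColumnRule:
    """One hypothetical repository entry: what a column is supposed to contain."""
    description: str
    is_valid: Callable[[Any], bool]


# Hypothetical repository contents (assumed for this sketch).
REPOSITORY = {
    "hire_year": ColumnRule(
        "four-digit year of hire",
        lambda v: isinstance(v, int) and 1900 <= v <= 2100,
    ),
    "state": ColumnRule(
        "two-letter uppercase state code",
        lambda v: isinstance(v, str) and len(v) == 2 and v.isalpha() and v.isupper(),
    ),
}


def find_violations(records, repository):
    """Return (row index, column, value) for every value that breaks its rule."""
    violations = []
    for index, record in enumerate(records):
        for column, rule in repository.items():
            value = record.get(column)  # a missing column is checked as None
            if not rule.is_valid(value):
                violations.append((index, column, value))
    return violations


# Made-up sample data: the second record breaks both rules.
records = [
    {"hire_year": 1998, "state": "TX"},
    {"hire_year": 98, "state": "Texas"},
]
violations = find_violations(records, REPOSITORY)
for row, column, value in violations:
    print(f"row {row}: {column}={value!r} violates '{REPOSITORY[column].description}'")
```

A check of this kind flags only values that break the recorded rules; a value that satisfies every rule can still be wrong. Its usefulness also depends on the rules being kept current, which is the reason for updating the repository at every stage.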
The area of metadata repositories is ripe for new development, meaningful standardization, and widespread deployment. Repository technology should emerge as the next most important technology that corporations demand once they have made significant progress on data quality. The business case for repository projects will emerge as the cost of not having them, which is paid in inhibited data quality improvement efforts, becomes clearer. The business value of data quality initiatives will generally depend on the ability of the corporation to maintain an accurate metadata repository. To the extent that a corporation does not maintain one, it will not reap the maximum value from its data quality efforts.