This post is one of a series dealing with common Enterprise Wide Data Governance Issues. Assess the status of this issue in your Enterprise by clicking here: Data Governance Issue Assessment Process
Too often, Data Migration and ETL projects are built on the basis of metadata alone, without measuring what is actually contained in the source data fields. This happens when the IT function build data ‘pipes’ on the basis of what the metadata says the source fields should contain, and don’t perform data content analysis, or data profiling, to find out what the source fields actually contain.
The IT function turn the ‘tap’ on, the data flows through the ‘pipes’ and the business express surprise, followed by denial, when expectations cannot be met due to data quality issues. This is known as the ‘Load and Explode’ approach to data.
To prevent ‘Load and Explode’ impacting the success of your data dependent projects, agree and apply the following policy:
Before building or purchasing a system that is dependent on existing data, projects must complete the following process:
- Define what data is required.
- Define the quality requirements of the required data.
- Identify the source of the required data.
- Specify the data quality metrics to be captured.
- Measure the quality of the available source data.
- Understand the implications of the quality of available source data for the proposed system.
- If necessary, and if feasible, implement data quality improvement measures to raise the quality to the required level.
- Worst case: if the facts tell you the data quality is too low and cannot be improved, cancel the project and save yourself a ton of money!
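To make the "measure the quality of the available source data" step concrete, here is a minimal data-profiling sketch in Python. It is illustrative only: the field name, sample data, and expected pattern are all hypothetical, and a real project would profile every required field against its agreed quality metrics. The idea is simply to count what a source field *actually* contains (blanks, distinct values, values conforming to the format the metadata promises) before any pipes are built.

```python
import csv
import io
import re

def profile_field(values, expected_pattern=None):
    """Profile one source field: measure what it actually contains,
    rather than trusting what the metadata says it should contain."""
    total = len(values)
    populated = [v for v in values if v.strip()]
    profile = {
        "total": total,
        "null_or_blank": total - len(populated),
        "distinct": len(set(populated)),
    }
    if expected_pattern:
        pattern = re.compile(expected_pattern)
        # How many populated values match the format the metadata claims?
        profile["conforming"] = sum(
            1 for v in populated if pattern.fullmatch(v)
        )
    return profile

# Hypothetical source extract: the metadata claims 'phone' is
# always a 10-digit number, but the content says otherwise.
sample = io.StringIO(
    "customer_id,phone\n"
    "1,0123456789\n"
    "2,\n"
    "3,unknown\n"
    "4,0123456789\n"
)
rows = list(csv.DictReader(sample))
phones = [row["phone"] for row in rows]
print(profile_field(phones, expected_pattern=r"\d{10}"))
# {'total': 4, 'null_or_blank': 1, 'distinct': 2, 'conforming': 2}
```

Running a simple measurement like this against every source field, before the pipes are built, is what turns "the metadata says it's fine" into facts the business can act on.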
Have you faced the above issue in your organisation, or while working with clients? What did you do to resolve it? Please share your experience by posting a comment – Thank you – Ken.