Common Enterprise Wide Data Governance Issues: #9 Data Migration and ETL projects are Metadata driven

This post is one of a series dealing with common Enterprise Wide Data Governance Issues.  Assess the status of this issue in your Enterprise by clicking here: Data Governance Issue Assessment Process

Too often, Data Migration and ETL projects are built on the basis of Metadata alone, without measuring what the source data fields actually contain.  This happens when the IT function build data ‘pipes’ based on what the metadata says the source fields should contain, without performing data content analysis (data profiling) to find out what the source fields actually contain.
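Purely as an illustration (not from the original post; the field names and records are invented), a minimal data profiling sketch in Python shows how little code it takes to discover what a source field actually contains before any ‘pipe’ is built:

```python
from collections import Counter

def profile_field(rows, field):
    """Report what a source field actually contains, not what the metadata claims."""
    values = [str(row.get(field, "")) for row in rows]
    populated = [v for v in values if v.strip()]
    return {
        "total_records": len(values),
        "populated": len(populated),
        "completeness_pct": round(100.0 * len(populated) / len(values), 1) if values else 0.0,
        "distinct_values": len(set(populated)),
        "top_values": Counter(populated).most_common(5),  # surfaces defaults, rogue codes, 'N/A', etc.
    }

# Hypothetical extract rows and field name, for illustration only.
rows = [
    {"country_code": "IE"},
    {"country_code": "IE"},
    {"country_code": "N/A"},   # metadata says ISO code; the content says otherwise
    {"country_code": ""},
]
print(profile_field(rows, "country_code"))
# -> {'total_records': 4, 'populated': 3, 'completeness_pct': 75.0,
#     'distinct_values': 2, 'top_values': [('IE', 2), ('N/A', 1)]}
```

Even a profile this small routinely surfaces default values, free text in coded fields, and unpopulated records that the metadata never mentions.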

Impact:
The IT function turn the ‘tap’ on, the data flows through the ‘pipes’, and the business express surprise, followed by denial, when expectations cannot be met due to data quality issues.  This is known as the ‘Load and Explode’ approach to data.

Solution:
To prevent ‘Load and Explode’ impacting the success of your data-dependent projects, agree and apply the following policy:

Before building or purchasing a system that is dependent on existing data, projects must complete the following process (steps 5–8 are sketched in code after the list):

  1. Define what data is required.
  2. Define the quality requirements of the required data.
  3. Identify the source of the required data.
  4. Specify the data quality metrics to be captured.
  5. Measure the quality of the available source data.
  6. Understand the implications of the quality of available source data for the proposed system.
  7. If necessary, and if feasible, implement data quality improvement measures to raise the quality to the required level.
  8. Worst case – if the facts tell you data quality is too low and cannot be improved – Cancel the project and save yourself a ton of money!
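To make steps 5–8 concrete, here is a minimal sketch (the metric names and thresholds are my own assumptions, purely for illustration) of a quality gate that turns measured quality into a proceed / remediate / cancel decision:

```python
# Illustrative only: metric names and thresholds are hypothetical assumptions.
REQUIRED = {"completeness_pct": 98.0, "validity_pct": 95.0}

def quality_gate(measured, required=REQUIRED, improvable=True):
    """Steps 5-8: compare measured quality to required quality and decide."""
    shortfalls = {metric: (measured.get(metric, 0.0), target)
                  for metric, target in required.items()
                  if measured.get(metric, 0.0) < target}
    if not shortfalls:
        return "PROCEED"                     # quality meets requirements
    if improvable:
        return f"REMEDIATE: {shortfalls}"    # step 7: improve quality first
    return "CANCEL"                          # step 8: the facts say the bar cannot be reached

print(quality_gate({"completeness_pct": 91.2, "validity_pct": 99.0}))
# -> REMEDIATE: {'completeness_pct': (91.2, 98.0)}
```

The point is not the code but the policy it encodes: the decision to build is taken after the facts about source data quality are known, not before.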

Your experience:
Have you faced the above issue in your organisation, or while working with clients?  What did you do to resolve it?  Please share your experience by posting a comment – Thank you – Ken.

Common Enterprise Wide Data Governance Issues: #8 Accessibility of Data is poor

This post is one of a series dealing with common Enterprise Wide Data Governance Issues.  Assess the status of this issue in your Enterprise by clicking here: Data Governance Issue Assessment Process

Accessibility of Data is poor.

  • Business users find it difficult to identify whom to ask about where to find required data (information).
    e.g. Compliance Units regularly receive regulatory requests for new details.  In some enterprises it is unclear whom the Compliance Unit should ask about locating the required data (information).
  • Business users experience difficulty and delays in locating required data (information).
  • Miscommunication between the business and IT leads to IT misinterpreting the data requirement and identifying inappropriate data.

Impact: Projects dependent on existing data must locate that data from first principles, and risk either not finding it or identifying inappropriate sources.

Solution:
Agree and implement the following policies:

  1. Overall ownership for data within the Enterprise lies with the CIO.
  2. Ownership for data within each Business Unit lies with the CIO and the head of the Business Unit.
  3. The CIO and head of each business unit must appoint a person with responsibility for the provision of data from that business unit, to those who require it.  This person is also responsible for the measurement and maintenance of data quality within the business unit.
  4. The CIO and head of each business unit must appoint a single point of contact to handle requests for data / information (see the sketch of a simple ownership registry below).
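Purely as an illustration (the record structure, names, and contact addresses are invented, not from the original post), the appointments above could be captured in a simple machine-readable registry, so that ‘who do I ask?’ always has one answer:

```python
from dataclasses import dataclass

@dataclass
class DataOwnershipRecord:
    """One record per business unit: who owns, stewards, and fields data requests."""
    business_unit: str
    data_owner: str          # jointly accountable with the CIO (policy 2)
    data_steward: str        # responsible for provision and quality (policy 3)
    point_of_contact: str    # single contact for data requests (policy 4)

# Hypothetical entries, for illustration only.
registry = [
    DataOwnershipRecord("Compliance", "Head of Compliance", "J. Smith",
                        "data-requests.compliance@example.com"),
    DataOwnershipRecord("Retail Banking", "Head of Retail", "A. Jones",
                        "data-requests.retail@example.com"),
]

def contact_for(unit):
    """Answer 'who do I ask?' for a given business unit."""
    return next(r.point_of_contact for r in registry if r.business_unit == unit)

print(contact_for("Compliance"))  # -> data-requests.compliance@example.com
```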

Your experience:
Have you faced the above issue in your organisation, or while working with clients?  What did you do to resolve it?  Please share your experience by posting a comment – Thank you – Ken.

Follow Friday – My Master Data Copy

Henrik Sørensen, @hlsdk, is a data expert based in Copenhagen.
Henrik came up with the excellent idea of applying Data Management to #FF #FollowFriday within Twitter.
You may view Henrik’s original blog post at http://liliendahl.wordpress.com/2009/07/31/follow-friday-master-data-hub/

Henrik not only includes the Twitter ID, but also the person’s LinkedIn ID (he performs #datamatching with LinkedIn connections to improve #dataquality through Identity Resolution).
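As a toy illustration of that idea (entirely my own sketch, not Henrik’s actual method), identity resolution can be as simple as fuzzy-matching a Twitter display name against LinkedIn profile names, and refusing to merge identities below a confidence threshold:

```python
from difflib import SequenceMatcher

def best_match(twitter_name, linkedin_names, threshold=0.7):
    """Match a Twitter display name to the closest LinkedIn profile name.

    Returns None unless the similarity clears the threshold, so uncertain
    identities are left unresolved rather than wrongly merged.
    """
    scored = [(SequenceMatcher(None, twitter_name.lower(), name.lower()).ratio(), name)
              for name in linkedin_names]
    score, name = max(scored)
    return name if score >= threshold else None

# Hypothetical names and an arbitrary threshold, for illustration only.
print(best_match("Henrik L. Sorensen",
                 ["Henrik Liliendahl Sorensen", "Steve Tuck"]))
# -> Henrik Liliendahl Sorensen
```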

Added Monday 31st August 2009 (takes time to manage!)

@ocdqblog is a blog where http://www.linkedin.com/in/jimharris is blogger-in-chief

@dataqualitypro is an online community for Data Quality Professionals founded by http://www.linkedin.com/in/dylanjones

@Datanomic Steve Tuck – Chief Strategy Officer at Datanomic http://www.linkedin.com/in/stevetuck

@stevesarsfield Data Governance Evangelist at Harte-Hanks Trillium Software http://www.linkedin.com/pub/steve-sarsfield/2/675/47a

@bonnieoneill data architect and author of 3 books 

Added Friday 14th August 2009

@hlsdk is http://www.linkedin.com/in/henrikliliendahlsoerensen

Common Enterprise Wide Data Governance Issues: #7 No SLAs defined for required quality of critical data

This post is one of a series dealing with common Enterprise Wide Data Governance Issues.  Assess the status of this issue in your Enterprise by clicking here:  Data Governance Issue Assessment Process

In some organisations there is a platitude that states: ‘The Business is responsible for the quality of the data’, but…

  • There are no SLAs defined for the required quality of critical data (Master Data)
  • There is no measurement performed of the actual quality of the data
  • There are no processes available to “The Business” to enable them to measure the quality of the data

Impact: Multiple enterprise wide data quality issues.

Solution:
Agree and implement the following policies:

  1. “The business” must be provided with standard Enterprise wide data quality measurement processes and tools.
  2. Business units must regularly measure the quality of critical data, using standard Enterprise wide processes and tools, and must agree SLAs with the users of the data defining the target quality level.
  3. Where necessary, business units must implement data quality improvement measures to meet the quality defined in the agreed SLA (a sketch of such an SLA follows).
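To show what such an SLA might look like in practice (a minimal sketch; the datasets, dimensions, and targets are invented for illustration), each SLA can be recorded as data and checked against regular measurements:

```python
from dataclasses import dataclass

@dataclass
class DataQualitySLA:
    """Agreed target for one quality dimension of one critical dataset."""
    dataset: str
    dimension: str       # e.g. completeness, validity
    target_pct: float    # agreed with the users of the data
    measured_pct: float = 0.0

    def breached(self):
        return self.measured_pct < self.target_pct

# Hypothetical SLAs and measurements, for illustration only.
slas = [
    DataQualitySLA("customer_master", "completeness", 98.0, measured_pct=96.5),
    DataQualitySLA("customer_master", "validity", 95.0, measured_pct=97.1),
]

for sla in slas:
    status = "BREACH" if sla.breached() else "OK"
    print(f"{sla.dataset}/{sla.dimension}: "
          f"measured {sla.measured_pct}% vs target {sla.target_pct}% -> {status}")
```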

Your experience:
Have you faced the above issue in your organisation, or while working with clients?  What did you do to resolve it?  Please share your experience by posting a comment – Thank you – Ken.

Common Enterprise Wide Data Governance Issues: #6 No Enterprise-Wide Data Quality Measurement of existing Data Content

This post is one of a series dealing with common Enterprise Wide Data Governance Issues.  Assess the status of this issue in your Enterprise by clicking here:  Data Governance Issue Assessment Process

There is a risk that Data Quality is either not measured at all, or measured on a piecemeal, localised basis, using non-standard processes and producing questionable results.

Impact: ‘If you do not measure, you cannot manage.’

Solution:
Agree and implement the following policy:

  1. The quality of critical data (Master Data) in critical databases must be measured regularly, using an Enterprise wide standard set of processes and tools, producing a standard set of quality metrics across a standard set of quality dimensions.
  2. Data quality SLAs must be agreed and implemented between the owners of the data, and the end users of the data.

Your experience:
Have you faced the above issue in your organisation, or while working with clients?  What did you do to resolve it?  Please share your experience by posting a comment – Thank you – Ken.

Common Enterprise Wide Data Governance Issues: #5 There is little understanding of what “Data Quality” means

This post is one of a series dealing with common Enterprise Wide Data Governance Issues.  Assess the status of this issue in your Enterprise by clicking here:  Data Governance Issue Assessment Process

When asked what ‘Data Quality’ means, senior management respond along the lines of ‘the data is either good (accurate) or bad (inaccurate)’.  There is little understanding of the commonly used dimensions of data quality (see the measurement sketch after this list):

  • Completeness
    Is the data populated?
  • Validity
    Is the data within the permitted range of values?
  • Accuracy
    Does the data represent reality or a verifiable source?
  • Consistency
    Is the same data consistent across different files/tables?
  • Timeliness
    Is the data available when needed?
  • Accessibility
    Is the data easily accessible, understandable and usable?
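To make the dimensions tangible, here is a small illustrative sketch (the field names, validity rules, and records are all invented) that measures Completeness and Validity on a handful of records; Accuracy would additionally require a verifiable source, and Consistency a second file or table to compare against:

```python
import re

# Hypothetical records, for illustration only.
records = [
    {"customer_id": "C001", "email": "anne@example.com", "country": "IE"},
    {"customer_id": "C002", "email": "", "country": "XX"},
    {"customer_id": "C003", "email": "not-an-email", "country": "GB"},
]

VALID_COUNTRIES = {"IE", "GB", "US"}                   # assumed permitted range of values
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")   # crude validity rule

def pct(part, whole):
    return round(100.0 * part / whole, 1)

populated = [r for r in records if r["email"].strip()]
completeness = pct(len(populated), len(records))       # Completeness: is it populated?
valid_emails = [r for r in populated if EMAIL_RE.match(r["email"])]
validity = pct(len(valid_emails), len(populated))      # Validity: within the permitted format?
valid_countries = [r for r in records if r["country"] in VALID_COUNTRIES]

print(f"email completeness: {completeness}%")          # -> 66.7%
print(f"email validity: {validity}%")                  # -> 50.0%
print(f"country validity: {pct(len(valid_countries), len(records))}%")  # -> 66.7%
```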

Impact: Without a shared understanding of what “Data Quality” means:

  • It is practically impossible to have a meaningful discussion about the existing and required data quality within an Enterprise.
  • Senior management are not in a position to request specific Data Quality metrics, and if you don’t measure, you can’t manage.
  • Business users are not in a position to clearly state the level of data quality they require.

Solution:
Agree and implement the following policy:

In discussing Data issues and requirements, data quality will be assessed using a standard set of quality dimensions across the Enterprise.

Your experience:
Have you faced the above issue in your organisation, or while working with clients?  What did you do to resolve it?  Please share your experience by posting a comment – Thank you – Ken.

Update:  
In October 2013, following an 18-month consultative process, DAMA UK published a white paper explaining six primary data quality dimensions:

1. Completeness
2. Uniqueness
3. Timeliness
4. Validity
5. Accuracy
6. Consistency

For more details see my blog post, Major step forward in Data Quality Measurement.