Data is the new oil – what grade is yours?

Bill Bryson’s book “One Summer: America 1927” provides a fascinating insight into the world of aviation in the Roaring Twenties. Aviators were vying to be the first to cross the Atlantic from New York to Paris, a challenge that claimed many lives, most of them European.

Bryson tells us: “The American flyers also had an advantage over their European counterparts that nobody yet understood. They all used aviation fuel from California, which burned more cleanly and gave better mileage. No one knew what made it superior because no one yet understood octane ratings – that would not come until the 1930s – but it was what got most American planes across the ocean while others were lost at sea.”

Once octane ratings were understood, fuel quality was measured and lives were saved.

We’ve all heard that data is the new oil. To benefit from this “new oil”, you must ensure you use “top grade” only. It can make the difference between business success and failure. It is also a prerequisite for regulatory compliance (GDPR, Solvency II, FATCA, Dodd-Frank, Basel III, BCBS 239, etc.). Thankfully, like octane ratings, we know how to measure data quality using six primary dimensions: completeness, validity, accuracy, uniqueness, timeliness and consistency. For more details see my post: Major step forward in Data Quality Measurement.
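
To make the dimensions concrete, here is a minimal sketch of how they might be measured against a small set of customer records; the field names, the 365-day timeliness threshold and the two-letter country convention are assumptions for illustration, not part of the formal definitions.

```python
# Minimal sketch: measuring the six dimensions over a list of records.
# Field names, thresholds and reference values are illustrative assumptions.
from datetime import date, datetime

records = [
    {"id": "C001", "email": "ann@example.com", "dob": "1975-04-12", "country": "IE", "updated": "2024-05-01"},
    {"id": "C002", "email": "",                "dob": "1990-13-40", "country": "Ireland", "updated": "2019-01-15"},
    {"id": "C001", "email": "ann@example.com", "dob": "1975-04-12", "country": "IE", "updated": "2024-05-01"},
]

def valid_date(s):
    try:
        datetime.strptime(s, "%Y-%m-%d")
        return True
    except ValueError:
        return False

total = len(records)
completeness = sum(1 for r in records if all(r.values())) / total          # every field populated
validity     = sum(1 for r in records if valid_date(r["dob"])) / total     # conforms to the date rule
uniqueness   = len({r["id"] for r in records}) / total                     # no duplicate identifiers
timeliness   = sum(1 for r in records if valid_date(r["updated"]) and
                   (date.today() - datetime.strptime(r["updated"], "%Y-%m-%d").date()).days < 365) / total
consistency  = sum(1 for r in records if len(r["country"]) == 2) / total   # same representation throughout
# Accuracy normally requires comparison against an authoritative source,
# so it is left as a placeholder here.
print(f"completeness={completeness:.0%} validity={validity:.0%} uniqueness={uniqueness:.0%} "
      f"timeliness={timeliness:.0%} consistency={consistency:.0%}")
```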

I also explore this topic in my post Russian Gas Pipe and Data Governance.

What happens in your organisation? Do you measure the quality of your most critical data, or do you fly on a wing and a prayer? Please add your comments below.

Do you know what’s in the data you’re consuming?

Standard facts are provided about the food we buy

These days, food packaging includes ingredients and a standard set of nutrition facts.  This is required by law in many countries.

Food consumers have grown accustomed to seeing this information, and now expect it. It enables them to make informed decisions about the food they buy, based on a standard set of facts.

Remarkable as it may seem, data consumers are seldom provided with facts about the data feeding their critical business processes.

Most data consumers assume the data input to their business processes is “right”, or “OK”.  They often assume it is the job of the IT function to ensure the data is “right”.  But only the data consumer knows the intended purpose for which they require the data.  Only the data consumer can decide whether the data available satisfies their specific needs and their specific acceptance criteria. To make an informed choice, data consumers need to be provided with facts about the data content available.

Data Consumers have the right to make informed decisions based on standard data content facts

The IT function, or a data quality function, can, and should, provide standard “data content facts” about all critical data, such as the facts shown in the example.

In the sample shown, a Marketing Manager wishing to mailshot customers in the 40-59 age range might find that the data content facts satisfy his/her data quality acceptance criteria.

The same data might not satisfy the acceptance criteria for a manager in the Anti Money Laundering (AML) area requesting an ETL process to populate a new AML system.
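
To illustrate, here is a minimal sketch of how such a “data content facts” label could be produced for a small customer extract; the field names, sample records and the 40-59 age band are assumptions for the example, not a standard format.

```python
# Sketch: producing "data content facts" for a customer extract, so each
# data consumer can judge it against their own acceptance criteria.
# Field names and the sample records are illustrative only.
from datetime import date

customers = [
    {"name": "A. Smith", "dob": date(1980, 3, 2),  "address": "1 Main St", "nationality": "IE"},
    {"name": "B. Jones", "dob": None,              "address": "",          "nationality": None},
    {"name": "C. Byrne", "dob": date(1969, 7, 30), "address": "5 High Rd", "nationality": "GB"},
]

def facts(records):
    total = len(records)
    label = {"records": total}
    for field in ("dob", "address", "nationality"):
        populated = sum(1 for r in records if r.get(field))
        label[f"{field}_populated_pct"] = round(100 * populated / total, 1)
    ages = [(date.today() - r["dob"]).days // 365 for r in records if r["dob"]]
    label["age_40_59_pct"] = round(100 * sum(40 <= a <= 59 for a in ages) / total, 1)
    return label

print(facts(customers))
# A marketing manager may accept partially populated dates of birth;
# an AML process needing nationality on every record may not.
```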

Increasing regulation means that organisations must be able to demonstrate the quality and trace the origin of the data they use in critical business processes.

In Europe, Solvency II requires insurance and re-insurance undertakings to demonstrate that the data they use for solvency calculations is as complete, appropriate and accurate as required for the intended purpose. Other regulatory requirements, such as Dodd-Frank in the USA, Basel III and BCBS 239, also seek increasing transparency regarding the quality of the data underpinning our financial system.

While regulation may be a strong driving force for providing standard data content facts, an even stronger one is the business benefit to be gained from being informed. Some time ago, Gartner research showed that approximately 70% of CRM projects failed. I wonder: were the business owners of those proposed CRM systems shown data content facts about the data available to populate them?

In years to come, we will look back on those crazy days when data consumers were not shown data content facts about the data they were consuming.

Know your data

You must know your data.

Do you know what’s in your data box of chocolates?

You must know where it is, what it should contain and what it actually contains.

When your data does not contain what it should, you must have a process for correcting it.

CEOs, CFOs and CROs often take the above as “given”.  They make business-critical decisions using information derived from data within their organisation.  After all, it’s applied common sense.

For the insurance industry, Solvency II requires evidence that you are applying common sense.

If you operate in the EU market or process the personal data of EU data subjects, you must comply with the EU General Data Protection Regulation (GDPR) or face severe fines. To comply, you must “know your (personal) data” and how you manage it.

In my experience, data is like a box of chocolates: “You never know what you’re gonna get.”

Do you know your data?

Charter of Data Consumer rights and responsibilities

Time for a charter of Data Consumer rights and responsibilities

There are many rights enshrined in law that benefit all of us. One example is the UN’s Universal Declaration of Human Rights.  Another is the “consumer rights” protection most countries enforce to guarantee us, the buying public, goods and services that are of good quality and “fit for purpose”.  As buyers of goods and services, we also have responsibilities.  If you or I buy a “Rolex watch” for $10 from a casual street vendor, we cannot claim consumer protection rights when the watch stops working within a week. “Let the buyer beware”, or “caveat emptor”, is the common sense responsibility that we, as consumers, must observe.

I have previously written about business users’ right to expect good data plumbing. Business users of data have responsibilities too.  I believe it’s time to agree a charter of rights and responsibilities for them.  Business users of data are “Data Consumers” – people who use data to perform their work, whatever that work may be.  Data Consumers make decisions based on the data or information available to them. Examples range from a doctor prescribing medication based on the information in a patient’s health records, to a multinational chief executive deciding to buy a business based on the performance figures available, to an actuary developing an internal model to determine Solvency II capital requirements.

What rights and responsibilities should data consumers have?

Here’s my starter set:

  • The right to expect data that is “fit for purpose”, data that is complete, appropriate and accurate.
  • The responsibility to define what “fit for purpose” data means to them.
  • The right to expect guidance and assistance in defining what constitutes complete, appropriate and accurate data for them.
  • The responsibility to explain the impact that “sub-standard” data would have on the work they do.
  • The right to be informed of the actual quality of the data they use.
  • The right to expect controls in place that verify the quality of the data they use meets the standard they require.

What do you think? Please share your suggestions below.

How do you collect your data?

Welcome to part 4 of Solvency II Standards for Data Quality – common sense standards for all businesses.

In my last post I highlighted the Solvency II requirement for Data Quality Management processes, which must include:

  • Assessment of the quality of your data
  • Resolution of material problems identified

Have you included plans for data cleansing to resolve the material problems identified? Furthermore, have you considered how you plan to prevent the problems recurring? Solvency II requires you to do this, as set out in the following paragraphs of the CEIOPS (now EIOPA) advice (Consultation Paper 43):

3.36 The assessment of data quality should have due regard to the quality and performance of the channels used to collect, store, process and transmit data…

Your “Data Supply Chain” is the means by which you “Collect, store, process and transmit data…”. You are expected to know your data supply chain, and to manage it effectively.

3.37 If material problems with the verification of the data quality criteria have been identified, the insurer should try to solve them within an appropriate timeframe… and should work towards the improvement of the data collection, storage or other relevant internal processes, so as to ensure the quality of the future data. Those data limitations should be appropriately documented, including a description of how such situations can be remedied and the assignment of responsibilities within the undertaking.
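
As a sketch of what appropriately documented data limitations might look like in practice, the structure below captures the problem, the remediation plan and the assigned owner; the field names and the example limitation are illustrative assumptions, not a format prescribed by the advice.

```python
# Sketch: one way to document a material data limitation in the spirit of 3.37 -
# what the problem is, how it will be remedied, and who owns it.
# The structure, field names and example content are assumptions.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DataLimitation:
    description: str              # the material problem identified
    affected_data: str            # dataset / supply-chain step affected
    remediation_plan: str         # how the situation will be remedied
    responsible_owner: str        # assigned responsibility within the undertaking
    target_resolution: date       # the "appropriate timeframe"
    prevention_measures: list = field(default_factory=list)

limitation = DataLimitation(
    description="Claim dates missing on 12% of motor claims received via broker feed",
    affected_data="Claims data warehouse - broker extract",
    remediation_plan="Backfill from broker source systems; add mandatory-field check on load",
    responsible_owner="Head of Claims Data",
    target_resolution=date(2025, 12, 31),
    prevention_measures=["Reject broker files failing completeness check",
                         "Monthly completeness report"],
)
print(limitation)
```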

How do you collect your data?

Solvency II mandates Data Governance

Welcome to part 3 of Solvency II Standards for Data Quality – common sense standards for all businesses.

Regardless of the industry you work in, you make critical business decisions based on the information available to you.  You would like to believe the information is accurate.  I suggest the CEIOPS standards for “Accuracy” apply to your business, and your industry, just as much as they apply to the insurance industry.  I would welcome your feedback…

The CEIOPS (now renamed EIOPA) advice makes it clear that Solvency II requires you to have Data Governance in place (which CEIOPS / EIOPA refers to as “internal systems and procedures”).  The following sections of the document set this out:

3.32 In order to ensure on a continuous basis a sufficient quality of the data used in the valuation of technical provisions, the undertaking should have in place internal systems and procedures covering the following areas:

• Data quality management;

• Internal processes on the identification, collection, and processing of data; and

• The role of internal/external auditors and the actuarial function.

3.1.4.1 Data quality management – Internal processes

3.33 Data quality management is a continuous process that should comprise the following steps:

a) Definition of the data;

b) Assessment of the quality of data;

c) Resolution of the material problems identified;

d) Monitoring data quality.

I will explore the above further in my next post.  Meanwhile, what Data Quality Management processes do you have in place?  Do you suffer from common Enterprise-Wide Data Governance Issues?
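
To make those four steps less abstract, here is a minimal sketch of how they might hang together in code; the rule definitions and the sample records are assumptions for illustration only.

```python
# Sketch of the four steps in 3.33 as a simple, repeatable loop.
# The rule definitions and the example records are illustrative assumptions.
import re

# a) Definition of the data: what each critical field should contain
definitions = {
    "policy_id": {"required": True, "pattern": r"^P\d{6}$"},
    "premium":   {"required": True, "minimum": 0.0},
}

def assess(records):
    """b) Assessment of data quality: list every breach of the definitions."""
    issues = []
    for index, record in enumerate(records):
        for fld, rule in definitions.items():
            value = record.get(fld)
            if rule.get("required") and value in (None, ""):
                issues.append((index, fld, "missing"))
            elif "pattern" in rule and not re.match(rule["pattern"], str(value)):
                issues.append((index, fld, "invalid format"))
            elif "minimum" in rule and float(value) < rule["minimum"]:
                issues.append((index, fld, "below minimum"))
    return issues

records = [{"policy_id": "P123456", "premium": 850.0},
           {"policy_id": "X99",     "premium": -10.0}]

for issue in assess(records):
    # c) Resolution of material problems: in practice, route each issue to its data owner
    print("resolve:", issue)

# d) Monitoring: re-run the assessment on a schedule and track the trend over time.
```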

What does complete, appropriate and accurate mean?

Welcome to part 2 of Solvency II Standards for Data Quality – common sense standards for all businesses.

The Solvency II Standards for Data Quality run to 22 pages and provide an excellent substitute for counting sheep if you suffer from insomnia. They are published by the Committee of European Insurance and Occupational Pensions Supervisors (CEIOPS), now renamed EIOPA.

Solvency II Data Quality Standards – not as page-turning as a Dan Brown novel

I accept that Data Quality Standards cannot aspire to be as page-turning as a Dan Brown novel – but plainer English would help.

Anyway – enough  complaining.  As mentioned in part 1, the standards require insurance companies to provide evidence that their Solvency II submissions are based on data that is “as complete, appropriate, and accurate as possible”.  In this post, I will explore what the regulator means by “complete”, “appropriate” and “accurate”.  I will look at the terms in the context of data quality for Solvency II, and will highlight how the same common sense standards apply to all organisations.

APPROPRIATE: “Data is considered appropriate if it is suitable for the intended purpose” (page 19, paragraph 3.62).

Insurance companies must ensure they can provide for insurance claims. Hence, to be “appropriate”, the data must relate to the risks covered and to the value of the capital held to cover potential claims.  Insurance industry knowledge is required to identify the “appropriate” data, just as auto industry knowledge is required to identify data “appropriate” to the auto industry, and so on.

COMPLETE: (This one is pretty heavy, but I will include it verbatim, and then seek to simplify – all comments, contributions and dissenting opinions welcome) (page 19, paragraph 3.64)

“Data is considered to be complete if:

  • it allows for the recognition of all the main homogeneous risk groups within the liability portfolio;
  • it has sufficient granularity to allow for the identification of trends and to the full understanding of the behaviour of the underlying risks; and
  • if sufficient historical information is available.”

As I see it, there must be enough data, at a low enough level of detail, to provide a realistic picture of the main types of risks covered. Enough historical data is also required, since the history of past claims provides a basis for estimating the scale of future claims.

As with the term “appropriate”, I believe that insurance industry knowledge is required to determine whether the data available is “complete”.
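
That said, once the actuarial function has defined its acceptance criteria, part of the completeness check can be automated. A minimal sketch, assuming an illustrative five-year history threshold and made-up risk group names:

```python
# Sketch: a simple completeness check in the 3.64 sense - does each main
# homogeneous risk group have enough historical claim years to work with?
# Group names and the five-year threshold are illustrative assumptions.
claim_years_by_risk_group = {
    "motor":           [2018, 2019, 2020, 2021, 2022, 2023],
    "household":       [2021, 2022, 2023],
    "commercial_fire": [2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023],
}

MIN_HISTORY_YEARS = 5  # assumed acceptance criterion, set by the actuarial function

for group, years in claim_years_by_risk_group.items():
    history = len(set(years))
    status = "sufficient" if history >= MIN_HISTORY_YEARS else "INSUFFICIENT"
    print(f"{group}: {history} years of history - {status}")
```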

ACCURATE: I believe this one is “pure common sense”, and applies to all organisations, across all industries. (page 19, paragraph 3.66)

Data is considered accurate if:

  • it is free from material mistakes, errors and omissions;
  • the recording of information is adequate, performed in a timely manner and is kept consistent across time;
  • a high level of confidence is placed on the data; and
  • the undertaking must be able to demonstrate that it recognises the data set as credible by using it throughout the undertaking’s operations and decision-making processes.

Update – In October 2013, following an 18-month consultative process, DAMA UK published a white paper explaining six primary data quality dimensions:

1. Completeness
2. Uniqueness
3. Timeliness
4. Validity
5. Accuracy
6. Consistency

For more details see my blog post, Major step forward in Data Quality Measurement.


How to deliver a Single Customer View

How to cost effectively deliver a Single Customer View

Many have tried, and many have failed, to deliver a “Single Customer View”.  Well, now it’s a regulatory requirement – at least for UK Deposit Takers (banks, building societies, etc.).

The requirement to deliver a Single Customer View of eligible deposit holders indirectly affects every man, woman and child in the UK.  Their deposits, large or small, are covered by the UK Deposit Guarantee Scheme.  This scheme played a key role in maintaining confidence in the banking system during the dark days of the world financial crisis.

UK Deposit Takers must not only deliver the required Single Customer View data, but also provide clear evidence of the data quality processes and controls they use to deliver and verify the SCV data.
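
As a sketch of the kind of automated control that could contribute to such evidence, the checks below confirm that every record carries the mandatory fields and that no customer appears twice; the record layout and field names are assumptions, not the prescribed SCV format.

```python
# Sketch: completeness and uniqueness checks over an SCV extract.
# The record layout, field names and sample data are illustrative assumptions.
scv_records = [
    {"customer_id": "1001", "name": "A. Smith", "address": "1 Main St", "aggregate_balance": 12500.00},
    {"customer_id": "1002", "name": "B. Jones", "address": "",          "aggregate_balance": 300.50},
    {"customer_id": "1001", "name": "A. Smith", "address": "1 Main St", "aggregate_balance": 12500.00},
]

mandatory_fields = ("customer_id", "name", "address", "aggregate_balance")

# Completeness: every mandatory field populated on every record
incomplete = [r["customer_id"] for r in scv_records
              if any(r.get(f) in (None, "") for f in mandatory_fields)]

# Uniqueness: each eligible depositor appears exactly once
seen, duplicates = set(), set()
for r in scv_records:
    cid = r["customer_id"]
    (duplicates if cid in seen else seen).add(cid)

print("records failing completeness:", incomplete)    # ['1002']
print("duplicate customer ids:", sorted(duplicates))  # ['1001']
```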

The deadline for compliance is challenging.  Plans must be submitted to the regulator by July 2010, and the SCV must be built and verified by Jan 2011.

To help UK Deposit Takers, I have written an E-book explaining how to cost effectively deliver a Single Customer View.  You may download this free from the Dataqualitypro website:

While the document specifically addresses the UK Financial Services Requirement for a Single Customer View, the process steps will help anyone planning a major data migration / data population project.

If you are in any doubt about the need for good data quality management processes to deliver any new system (e.g. Single Customer View, Solvency II, etc.), read the excellent Phil Simon interview on Dataqualitypro about why new systems fail.

Common Enterprise wide Data Governance Issues – #12. No Enterprise wide Data Dictionary.

This post is one of a series dealing with common Enterprise Wide Data Governance Issues.    Assess the status of this issue in your Enterprise by clicking here:  Data Governance Issue Assessment Process

No Idea What This Means

Anyone know what this acronym means?

An excellent series of blog posts from Phil Wright (Balanced approach to scoring data quality) prompted me to restart this series.  Phil tells us that in his organisation, “a large amount of time and effort has been applied to ensure that the business community has a definitive business glossary, containing all the terminology and business rules that they use within their reporting and business processes. This has been published, and highly praised, throughout the organisation.” I wish other organisations were like Phil’s.

Not only do some organisations lack “a definitive business glossary”, complete with business rules, as Phil describes above – some organisations have no Enterprise-wide Data Dictionary at all.  What is worse, there is no appreciation within senior management of the need for an Enterprise-wide Data Dictionary (and therefore no budget to develop one).

Impact(s):

  • No business definition, or contradictory business definitions of the intended content of critical fields.
  • There is an over dependence on a small number of staff with detailed knowledge of some databases.
  • Incorrect or non-ideal sources of required data are identified – because the source of required data is determined by personnel with expertise in specific systems only.
  • New projects, dependent on existing data, are left ‘flying blind’.  The impact is similar to landing in a foreign city, with no map and not speaking the language.
  • Repeated re-invention of the wheel, duplication of work, with associated costs.

Solution:

The CIO should define and implement the following policy, in addition to the policies listed for Data Governance Issue #10:

  • An Enterprise wide Data Dictionary will be developed covering critical Enterprise wide data, in accordance with industry best practice.
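
To make the policy concrete, here is a minimal sketch of what a single entry in such a dictionary might capture; the structure and field names are a common-sense starting point, not an industry standard.

```python
# Sketch: a minimal structure for one Enterprise-wide Data Dictionary entry.
# The fields shown are illustrative assumptions, not a prescribed standard.
from dataclasses import dataclass

@dataclass
class DictionaryEntry:
    business_name: str        # the term as the business knows it
    definition: str           # agreed business definition of the intended content
    owner: str                # accountable data owner / steward
    system_of_record: str     # authoritative source system
    physical_location: str    # schema.table.column (or equivalent)
    business_rules: list      # validation rules the content must satisfy

entry = DictionaryEntry(
    business_name="Customer Date of Birth",
    definition="Date of birth of the legal account holder, as evidenced at onboarding",
    owner="Head of Customer Data",
    system_of_record="CRM",
    physical_location="crm.customer.date_of_birth",
    business_rules=["Not null for personal customers", "Must be in the past",
                    "Age between 18 and 120"],
)
print(entry.business_name, "->", entry.physical_location)
```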

Does your organisation have an “Enterprise wide Data Dictionary” – if so, how did you achieve it?  If not, how do new projects that depend on existing data begin the process of locating that data?  Please share your experience.

Plug and Play Data – The future for Data Quality

The excellent IAIDQ World Quality Day webinar looked at what the Data Quality landscape might be like in five years’ time, in 2014.  This got me thinking.  Dylan Jones’ excellent article on The perils of procrastination made me think some more…

Plug and Play Data

I believe that we data quality professionals need a paradigm shift in the way we think about data.  We need to make “get data right first time” and “Data Quality by Design” such no-brainers that procrastination is not an option.  We need to promote a vision of the future in which all data is reusable and interchangeable – a world of “Plug and Play Data”.

Everybody, even senior business management, understands the concepts of “plug and play” and reusable play blocks.  For “plug and play” to succeed, interconnecting parts must be complete, fully moulded, and conform to clearly defined standards.  Hence “plug and play data” must be complete, fully populated, and conform to clearly defined standards (business rules).

How can organisations “get it right first time” and create “plug and play data”?

It is now relatively simple to invoke cloud-based verification from any part of a system through which data enters.

For example, when opening a new “Student” bank account, cloud-based verification might prompt the bank assistant with a message like: “Mr. Jones’ date of birth suggests he is 48 years old. Is his date of birth correct? Is a Student Account appropriate for Mr. Jones?”
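
A minimal sketch of how such a point-of-entry rule might look; the product rule and the age threshold are illustrative assumptions, not any bank’s actual policy:

```python
# Sketch: the kind of rule a point-of-entry verification service might apply.
# The product rule and the 30-year age threshold are illustrative assumptions.
from datetime import date

def verify_account_opening(product, dob):
    """Return prompts for the bank assistant; an empty list means no issues."""
    prompts = []
    age = (date.today() - dob).days // 365
    if product == "Student Account" and age > 30:
        prompts.append(
            f"Date of birth suggests the customer is {age}. "
            "Is the date of birth correct? Is a Student Account appropriate?"
        )
    return prompts

print(verify_account_opening("Student Account", date(1977, 6, 1)))
```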

In conclusion:

We Data Quality Professionals need to educate both Business and IT on the need for, and the benefits of, “plug and play data”.  We need to explain to senior management that data is no longer needed or used by only one application.  We need to explain that even tactical solutions within Lines of Business need to consider Enterprise demands for data, such as:

  1. Data feed into regulatory systems (e.g. Anti-Money Laundering, Basel II, Solvency II)
  2. Access from or data feed into CRM system
  3. Access from or data feed into Business Intelligence system
  4. Ad hoc provision of data to satisfy regulatory requests
  5. Increasingly – feeds to and from other organisations in the supply chain
  6. Ultimate replacement of application with newer generation system

We must educate the business on the increasingly dynamic information requirements of the Enterprise – which can only be satisfied by getting data “right first time” and by creating “plug and play data” that can be easily reused and interconnected.

What do you think?