Data is the new oil – what grade is yours?

Bill Bryson’s book “One Summer: America 1927” provides a fascinating insight into the world of aviation in the “roaring 20s”. Aviators were vying to be the first to cross the Atlantic from New York to Paris, a challenge that took many lives, most of them European.

Bryson tells us: “The American flyers also had an advantage over their European counterparts that nobody yet understood. They all used aviation fuel from California, which burned more cleanly and gave better mileage. No one knew what made it superior because no one yet understood octane ratings – that would not come until the 1930s – but it was what got most American planes across the ocean while others were lost at sea.”

Once octane ratings were understood, fuel quality was measured and lives were saved.

We’ve all heard that data is the new oil. To benefit from this “new oil”, you must use only “top grade” data. It can make the difference between business success and failure. It is also a prerequisite for regulatory compliance (Solvency II, FATCA, Dodd-Frank, Basel III, BCBS 239, etc.). Thankfully, as with octane ratings, we know how to measure data quality, using 6 primary dimensions: completeness, validity, accuracy, uniqueness, timeliness and consistency. For more details, see my post: Major step forward in Data Quality Measurement.

I also explore this topic in my post Russian Gas Pipe and Data Governance.

What happens in your organisation? Do you measure the quality of your most critical data, or do you fly on a wing and a prayer? Please add your comments below.

Major step forward in Data Quality Measurement

How tall are you?
What is the distance between Paris and Madrid?
How long should one cook a 4.5Kg turkey for – and at what temperature?

Quality data is key to a successful business. To manage data quality, you must measure it.


We can answer the above questions thanks to “standard dimensions”:

Height: Metres / Feet
Distance: Kilometres / Miles
Time: Hours & Minutes
Temperature: Degrees Celsius / Fahrenheit

Life would be impossible without the standard dimensions above, even though the presence of “alternate” standards, such as metric vs. imperial, can cause complexity.

We measure things for a reason. Based on the measurements, we can make decisions and take action. Knowing our neck size enables us to decide which shirt size to choose. Knowing our weight and our waist size may encourage us to exercise more and perhaps eat less.

We measure data quality because poor data quality has a negative business impact that affects the bottom line.  Rectifying data quality issues requires more specific measurement than anecdotal evidence that data quality is “less than satisfactory”.

The great news is that 2013 marked a major step forward in the agreement of standard dimensions for data quality measurement.

In October 2013, following an 18-month consultative process, DAMA UK published a white paper called DAMA UK DQ Dimensions White Paper R3 7.

The white paper lists 6 standard data quality dimensions and provides worked examples. The 6 are:

1. Completeness
2. Uniqueness
3. Timeliness
4. Validity
5. Accuracy
6. Consistency
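To make the idea of measuring these dimensions concrete, here is a minimal Python sketch that scores four of them on a tiny, made-up set of customer records. The records, the field names, the email rule and the 90-day freshness threshold are all my own illustrative assumptions, and the ratios are simplified versions of the dimensions rather than the white paper's exact definitions. Accuracy and consistency are left out because they need a trusted reference source or a second data set to compare against.

```python
import re
from datetime import date

# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"id": 1, "email": "ann@example.com", "updated": date(2013, 10, 1)},
    {"id": 2, "email": None, "updated": date(2013, 9, 15)},
    {"id": 2, "email": "bob@example", "updated": date(2012, 1, 5)},
    {"id": 4, "email": "cat@example.com", "updated": date(2013, 10, 20)},
]

# Assumed validity rule: a deliberately simple "looks like an email" pattern.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_plausible_email(value):
    return bool(EMAIL_PATTERN.match(value))


def completeness(rows, field):
    """Share of rows where the field is populated."""
    return sum(r[field] is not None for r in rows) / len(rows)


def uniqueness(rows, field):
    """Distinct values divided by row count (1.0 means no duplicates)."""
    return len({r[field] for r in rows}) / len(rows)


def validity(rows, field, rule):
    """Share of populated values that satisfy the rule."""
    values = [r[field] for r in rows if r[field] is not None]
    return sum(rule(v) for v in values) / len(values) if values else 1.0


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows updated within max_age_days of the as_of date."""
    return sum((as_of - r[field]).days <= max_age_days for r in rows) / len(rows)


scorecard = {
    "completeness(email)": completeness(records, "email"),
    "uniqueness(id)": uniqueness(records, "id"),
    "validity(email)": validity(records, "email", is_plausible_email),
    "timeliness(updated)": timeliness(records, "updated", date(2013, 10, 31), 90),
}

for dimension, score in scorecard.items():
    print(f"{dimension}: {score:.0%}")
```

A scorecard like this only becomes useful once each score is tied to a business impact, for example the share of a marketing campaign that cannot be delivered because email addresses are missing or invalid.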

The dimensions are not new. I referred to 5 of them in a 2009 blog post, “There is little understanding among senior management of what ‘Data Quality’ means”.
The good news is that this white paper pulls together the thinking of many DQ professionals and provides a full explanation of the dimensions. More importantly, it emphasises the criticality of assessing the organisational impact of poor data quality. I include a quote below:

“Examples of organisational impacts could include:
• incorrect or missing email addresses would have a significant impact on any marketing campaigns
• inaccurate personal details may lead to missed sales opportunities or a rise in customer complaints
• goods can get shipped to the wrong locations
• incorrect product measurements can lead to significant transportation issues i.e. the product will not fit into a lorry, alternatively too many lorries may have been ordered for the size of the actual load
Data generally only has value when it supports a business process or organisational decision making.”

I would like to thank DAMA UK for publishing this white paper. I expect to refer to it regularly in my day-to-day work. It will help me build upon my thoughts in my blog post “Do you know what’s in the data you’re consuming?”

Hopefully regulators worldwide will refer to this paper when considering data quality management requirements.

Some excellent articles / blog posts / videos referring to this white paper include:

Nicola Askham – Data Quality Dimensions

3-2-1 Start Measuring Data Quality

Great Data Debate (2) Danger in Dimensions, Kenneth MacKinnon

How do you expect this paper will affect your work? Please share your thoughts. 

Opportunity to apply lessons learnt in my new job

This week I started a new job as Head of Customer Information at Bank of Ireland in Dublin. I am excited at the prospect of applying the lessons I have learnt for the benefit of our customers.

I would like to take this opportunity to thank my fellow data management professionals worldwide for generously sharing their experience with me. I started to write this blog in 2009. My objective was to “Share my experience and seek to learn from the experience of others”. I have certainly learnt from the experience of others, and I hope to continue to do so.

The opinions I express on this blog will continue to be my own. I look forward to continuing to hear yours.

FSA imposes £2.4 million fine for inadequate risk reporting systems

London 18th March 2013 – FSA imposes £2.4 million fine for inadequate risk reporting systems, which led to a failure to keep investors informed ahead of a profit warning which wiped 57% off the company’s share price. (See London Evening Standard: “Watchdog gets tougher as oil-rig firm Lamprell is fined £2.4 million over stock market breach”).

Oil services group Lamprell is not a bank. However, Lamprell could have avoided this fine if it had implemented the new BCBS principles for effective risk data aggregation and risk reporting practices (BCBS 239), published in January 2013. I describe these principles in a previous post as “Data aggregation and reporting principles – applied common sense”.

I include below some quotes from the article, and in parentheses, the relevant text from the BCBS 239 principles:

  • “The FSA said that monthly reports to the board had been totally inadequate for a company of its size and that such reports were delivered late.”
    (Principle 5: Timeliness. Paragraph 44 “A bank’s risk data aggregation capabilities should ensure that it is able to produce aggregate risk information on a timely basis to meet all risk management reporting requirements.”)
  • “It also said the takeover of a rival in 2011, which doubled Lamprell’s size, had left the company using too many different reporting systems.”
    (Principle 1: Governance. Paragraph 29 “A bank’s risk data aggregation capabilities and risk reporting practices should be… Considered as part of any new initiatives, including acquisitions and/or divestitures… When considering a material acquisition, a bank’s due diligence process should assess the risk data aggregation capabilities and risk reporting practices of the acquired entity, as well as the impact on its own risk data aggregation capabilities and risk reporting practices. The impact on risk data aggregation should be considered explicitly by the board and inform the decision to proceed. The bank should establish a timeframe to integrate and align the acquired risk data aggregation capabilities and risk reporting practices within its own framework.”)

Tracey McDermott, FSA director of enforcement and financial crime, said: “Lamprell’s systems and controls may have been adequate at an earlier stage, but failed to keep pace with its growth. As a result they were seriously deficient for a listed company of its size and complexity, meaning it was unable to update the market on crucial financial information in a timely manner.”

The moral of the story… ensure your organisation, regardless of your industry, applies the common sense set out in: “Data aggregation and reporting principles (BCBS 239) – applied common sense”.

The growing demand for food and data provenance

In November 2012, I presented at the Data Management and Information Quality Europe 2012 conference in London. My presentation was called “Do you know what’s in the data you’re consuming?”

In the presentation, I compare the data supply chain with the food supply chain.

I believe that data consumers have the right to be provided with facts about the content of the data they are consuming, just as food consumers are provided with facts about the food they are buying. The presentation provides guidelines on how you can improve your data supply chain.

Little did I realise that within 3 months the term “provenance” would be hitting the headlines due to the European horsemeat scandal.

There’s a silver lining in this food scandal for data quality management professionals. As financial regulators increasingly demand evidence of the provenance of the data provided to them, it is now easier for data quality management professionals to explain to their business colleagues and senior management what “data provenance” means, and what it requires.  Retailers, such as Tesco, must have controls in their supply chain that ensure that the food they sell to consumers only contains “what it says on the tin”. Similarly, financial services organisations providing data to financial regulators must have controls in their data supply chain that ensure the quality of the data they provide can be trusted. Regulators are now asking financial services organisations to demonstrate evidence that their data supply chain can be trusted. They require organisations to demonstrate evidence of their data provenance, as applied to their critical or material data.

But what exactly is “data provenance”? The best definition I have seen comes from Michael Brackett in his excellent book “Data Resource Simplexity”.

“Data Provenance is provenance applied to the organisation’s data resource. The data provenance principle states that the source of data, how the data were captured, the meaning of the data when they were first captured, where the data were stored, the path of those data to the current location, how the data were moved along that path, and how those data were altered along that path must be documented to ensure the authenticity of those data and their appropriateness for supporting the business”.
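Brackett's definition reads like a checklist of facts to document, so it can be sketched as a simple record structure. The Python below is a minimal illustration under my own assumptions: the class names, field names and example values are hypothetical and do not come from the book or from any regulator.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceStep:
    """One hop on the path from the original store to the current location."""
    location: str             # where the data landed after this hop
    moved_by: str             # how the data were moved (e.g. batch job, API)
    alteration: str = "none"  # how the data were altered on this hop


@dataclass
class ProvenanceRecord:
    """The facts the definition says must be documented (names are assumed)."""
    source: str            # the source of the data
    capture_method: str    # how the data were captured
    original_meaning: str  # meaning of the data when first captured
    stored_in: str         # where the data were first stored
    path: List[ProvenanceStep] = field(default_factory=list)

    def missing_facts(self):
        """Return the names of provenance facts that are still undocumented."""
        names = ("source", "capture_method", "original_meaning", "stored_in")
        gaps = [name for name in names if not getattr(self, name).strip()]
        if not self.path:
            gaps.append("path")
        return gaps


# Hypothetical example: the original meaning was never written down.
balance = ProvenanceRecord(
    source="branch account-opening form",
    capture_method="manual entry by branch staff",
    original_meaning="",
    stored_in="core banking system",
    path=[ProvenanceStep("data warehouse", "nightly batch load",
                         "currency converted to EUR")],
)

gaps = balance.missing_facts()
print("Undocumented provenance facts:", gaps)
```

Running a check like `missing_facts()` across your critical data items gives a quick list of where the data supply chain cannot yet show “what it says on the tin”.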

Enjoy your “beef” burger!

The link between horse meat in beef burgers and data quality management

It was reported today that horse DNA was detected in tests performed on frozen “beef” burgers in a number of UK and Irish supermarkets. This has come as a shock to consumers, who assumed that quality controls were in place to ensure that food contains only what it says on the label. It appears the quality controls did not include a specific test for the presence of horse meat.

In an earlier blog post, I asked “Do you know what’s in the data you’re consuming?” In that post, I proposed that, as data consumers, we have the right to expect facts about the business critical data we consume – just as food consumers are provided with nutritional facts. Today’s news reminds us to be clear about the data quality facts we ask for. 

The old adage applies: “If you don’t measure, you can’t manage”.

The dog and the frisbee and data quality management

The Wall Street Journal reported it as the “Speech of the year”.

In a speech with the intriguing title “The dog and the frisbee”, Andrew Haldane, the Bank of England Director of Financial Stability, questioned whether the Emperor (in the form of ever increasing, ever more complex regulations such as Solvency II, Basel III and Dodd-Frank) is naked. He points out that the Basel regulations, which have grown from 30 pages to over 600 pages, completely failed to identify banks that were at risk of collapse, while a simple measure of a bank’s leverage ratio did identify them.

He also points out “Dodd-Frank makes Glass-Steagall look like throat-clearing.” The Glass-Steagall act of 1933, which separated commercial and investment banking, ran to a mere 37 pages; the Dodd-Frank act of 2010 ran to 848, and may spawn a further 30,000 pages of detailed rule-making by various agencies.

I recommend you read the speech yourself – his arguments, together with his wit are superb. I include a brief extract below:

‘In the UK, regulatory reporting was introduced in 1974. Returns could have around 150 entries. In the Bank of England archives is a memo to George Blunden, who was to become Deputy Governor, on these proposed regulatory returns. Blunden’s handwritten comment reads: “I confess that I fear we are in danger of becoming excessively complicated and that if so we may miss the wood from the trees”.

Today, UK banks are required to fill in more than 7,500 separate cells of data – a fifty-fold rise. Forthcoming European legislation will cause a further multiplication. Banks across Europe could in future be required to fill in 30–50,000 data cells spread across 60 different regulatory forms. There will be less risk of regulators missing the wood from the trees, but only because most will have needed to be chopped down.’

Brilliant!

Andrew Haldane is calling for simpler, more basic rules. I agree with him.

I have worked in data management for over 30 years. The challenges I see today are the same challenges that arise time and time again. They are not Solvency II-specific, Basel-specific or Dodd-Frank-specific. They are universal. They apply to all critical data within all businesses.

The fundamental truth is: “The data is unique, but the data management principles are universal.”

It is time to stop writing specific data management and data quality management requirements into specific legislation. Regulators should co-operate with the data management profession, via independent organisations such as DAMA International, to develop a common-sense universal standard, and put their effort into improving that standard.

What do you think? I welcome your comments.