Notes on Data Quality…

You know how it goes – you start trying to track down a forward “forthcoming” reference and you end up wending your way through all manner of things until you get back to where you started none the wiser… So here’s a snapshot of several docs I found trying to source the original forward reference for following table, found in Improving information to support decision making: standards for better quality data (November 2007, first published October 2007) with the crib that several references to it mentioned the Audit Commission…

datainfoKnowledge

The first thing I came across was The Use of Information in Decision Making – Literature Review for the Audit Commission (2008), prepared by Dr Mike Kennerley and Dr Steve Mason, Centre for Business Performance Cranfield School of Management, but that wasn’t it… This document does mention a set of activities associated with the data-to-decision process: Which Data, Data Collection, Data Analysis, Data Interpretation, Communication, Decision making/planning.

The data and information definitions from the table do appear in a footnote – without reference – in Nothing but the truth? A discussion paper from the Audit Commission in Nov 2009, but that’s even later… The document does, however, identify several characteristics (cited from an earlier 2007 report (Improving Information, mentioned below…), and endorsed at the time by Audit Scotland, Northern Ireland Audit Office, Wales Audit Office and CIPFA, with the strong support of the National Audit Office), that contribute to a notion of “good quality” data:

Good quality data is accurate, valid, reliable, timely, relevant and complete. Based on existing guidance and good practice, these are the dimensions reflected in the voluntary data standards produced by the Audit Commission and the other UK audit agencies
* „Accuracy – data should be sufficiently accurate for the intended purposes.
* Validity – data should be recorded and used in compliance with relevant requirements.
„* Reliability – data should reflect stable and consistent data collection processes across collection points and over time.
„* Timeliness – data should be captured as quickly as possible after the event or activity and must be available for the intended use within a reasonable time period.
„* Relevance – data captured should be relevant to the purposes for which it is used.
„* Completeness – data requirements should be clearly specified based on the information needs of the body and data collection processes matched to these requirements.

The document also has some pretty pictures, such as this one of the data chain:

datachain

In the context of the data/information/knowledge definitions, the Audit Commission discussion document also references the 2008 HMG strategy document Information matters: building government’s capability in managing knowledge and information which includes the table in full; a citation link is provided, but 404s, but a source is given to the November 2008 version of Improving information, the one we originally started with. So the original reference forward refers the table to an unspecified report, but future reports in the area refer back to that “original” without making a claim to the actual table itself?

Just in passing, whilst searching for the Improving information report, I actually found another version of it… Improving information to support decision making: standards for better quality data, Audit Commission, first published March 2007.

twireports

The table and the definitions as cited in Information Matters do not seem to appear in this earlier version of the document?

PS Other tables do appear in both versions of the report. For example, both the March 2007 and November 2007 versions of the doc contain this table (here, taken from the 2008 doc) of stakeholders:

stakeholders

Anyway, aside from all that, several more documents for my reading list pile…

PS see also Audit Commission – “In the Know” from February 2008.

2 comments

  1. David Chapman

    So the ‘forthcoming’ document was probably never forthcoming. Could be quite a handy trick, I guess. In document A reference something in forthcoming document B. You never produce document B but a future document C can reference document A, and document B is not need.

    That ‘data chain’ is pretty, but I’m not sure what story it is trying to tell. What is that ‘informs’ arrow doing?

    • Tony Hirst

      Simple feedback loop – having tried to make sense of the data you collected, you realise how you should have collected it/what you should have collected, so you can do it better next time…;-)