Several Takes on the Notion of “Data Laundering”
Picking up on Sleight of Hand and Data Laundering in Evidence Based Policy Making and Paul Bradshaw’s response that we should maybe follow the data, here’s a quick summary of several competing conceptualisations of “data laundering”.
The first relates to the usage in the sense of “[o]bscuring, removing, or fabricating the provenance of illegally obtained data such that it may be used for lawful purposes” ([SectorPrivate's] definition of Data Laundering – as inspired by William Gibson from Mona Lisa Overdrive).
SectorPrivate cites this example from a 2005 Privacy conference paper by Thilo Weichert (Privacy and Data Protection in federal police cooperation):
As working in joint committees naturally is a more informal cooperation, supervision as regards data protection is practically impossible. It is not ensured that personal data transmissions are put down in a protocol being checked. Therefore, it is usually impossible to find out the point of origin of specific information, whether it was obtained lawfully and how its utilisation is limited. In this context one can even talk about data laundering facility: data obtained unlawfully can be passed across the table and be processed without complaints by the receiver in a now cleaned form and can thereupon be passed back.
A more recent reference in ThinkMind // International Journal On Advances in Security, volume 2, numbers 2 and 3, 2009 on Design Patterns for a Systemic Privacy Protection identifies the following:
Problem Situation 4 – Data laundering. Companies are paying a lot of money for personal and group profiles and there are market actors in position to sell them.
This is clearly against data protection principles. This phenomenon is known as ‘data laundering’. Similar to money laundering, data laundering aims to make illegally obtained personal data look as if they were obtained legally, so that they can be used to target customers.
This example is also referred to in an EU Sixth Framework Information Society Technologies deliverable – Safeguards in a World of Ambient Intelligence (SWAMI): Threats, Vulnerabilities and Safeguards in Ambient Intelligence, Deliverable D3, 3 July 2006 – which cites the source as the second SWAMI deliverable. SWAMI-D2 describes the process of data laundering as follows: “Via a large number of transactions and operations, the illegal origin (illegal collection) of personal data can be camouflaged”.
The third deliverable goes on to make the following recommendation:
A means to prevent data laundering could be an obligation imposed on those who buy or otherwise acquire databases, profiles and vast amounts of personal data, to check diligently the legal origin of the data. If the buyer does not check the origin and/or the legality of the databases and profiles, he could be considered equal to a receiver of stolen goods and thus held liable for illegal data processing. An obligation could also be created which would require buyers to notify the national data protection officers when personal data(bases) are acquired. Persons or companies involved or assisting in data laundering could be made subject to criminal sanctions.
The SWAMI reports thus situate data laundering in the context of invasions of privacy and/or contraventions of data protection legislation. State sponsored, rather than evil criminal mafia initiated, usage of illegally acquired data (eg US gov’t data-laundering: using corporate databases to get around privacy law) also falls into a broadly similar area of data protection/privacy law abuse.
The term “data laundering” is also used, in various ways, in the sense of data cleaning, aka data cleansing (eg Quick and Dirty Data Laundering: A Scalable Solution for Range Checking Data, Data laundering by target rotation in chemistry-based oil exploration).
The sense in which I first came across the term was whilst discussing a data laundry process that could replace metadata records or fields in library catalogues that are tainted with commercial license restrictions with data of equivalent or higher quality, known provenance and open license terms (Open Data Processes: the Open Metadata Laundry).
The notion I was going for in Sleight of Hand and Data Laundering in Evidence Based Policy Making is different again. Whilst it shares the SWAMI characterisation insofar as it relates to the practice of removing provenance traces from a data set, it does not assume that the data was acquired illegally and it also differs in the purpose to which the laundered data is applied. In the sense I intended, the data is legal but is of low or unverified quality, contains a significant bias, or has a provenance that may lead to a conflict of interest arising from the use to which the data is to be put. The laundering is there not to remove traces of the illegal provenance of the data, but to mask the original provenance with a provenance, authority or veneer of quality associated with another agent, such that the data becomes accepted “at face value” with the imprimatur of an independent trusted party. The second part of my take on data laundering was the use to which the laundered data might be put. Specifically, having been laundered of its dubious provenance, and remarqued with a stamp of independent and/or trusted authority, the data would continue to make its way through a policy development process with the intention that it would influence the policy decision in favour of the outcome preferred by the agent who insinuated the original data into the data laundering chain.
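As a rough illustration only, here is a minimal Python sketch of the distinction I have in mind between honestly republishing a data set and laundering it. Everything in it is an assumption made up for the example: the Dataset class, the republish and launder functions, and the two agent names are hypothetical and do not come from any of the sources cited above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Dataset:
    """Hypothetical model: some values plus the chain of agents that handled them."""
    values: Tuple[float, ...]
    provenance: Tuple[str, ...]  # oldest source first


def republish(ds: Dataset, agent: str) -> Dataset:
    """Honest republication: the new agent is appended and earlier history is kept."""
    return Dataset(ds.values, ds.provenance + (agent,))


def launder(ds: Dataset, agent: str) -> Dataset:
    """Laundering in the sense above: identical values, but the earlier
    history is dropped so that only the new agent appears as the source."""
    return Dataset(ds.values, (agent,))


# Hypothetical agents, invented purely for illustration.
original = Dataset((4.2, 3.9, 5.1), ("Lobby group survey",))
honest = republish(original, "Independent think tank")
laundered = launder(original, "Independent think tank")

print(honest.provenance)     # both the original source and the republisher
print(laundered.provenance)  # only the republisher remains visible
```

The point of the sketch is that the values are untouched in both cases. What changes is whether a downstream reader, such as someone in a policy development process, can still see where the data originally came from.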
Compare this with the Wikipedia description of money laundering: “Money laundering often occurs in three steps: first, cash is introduced into the financial system by some means (‘placement’), the second involves carrying out complex financial transactions in order to camouflage the illegal source (‘layering’), and the final step entails acquiring wealth generated from the transactions of the illicit funds (‘integration’).”
I would contend that there are thus several different sorts of data malpractice that we might term data laundering, and that one of the tasks facing a Fourth Estate might be to clarify and chase down these various abuses of process, whether they occur in the corporate world, academia, the public sector or in government itself.