Several Takes on the Notion of “Data Laundering”

Picking up on Sleight of Hand and Data Laundering in Evidence Based Policy Making and Paul Bradshaw’s response that we should maybe follow the data, here’s a quick summary of several competing conceptualisations of “data laundering”.

The first relates to the usage in the sense of “[o]bscuring, removing, or fabricating the provenance of illegally obtained data such that it may be used for lawful purposes” ([SectorPrivate’s] definition of Data Laundering – as inspired by William Gibson from Mona Lisa Overdrive).

SectorPrivate cites this example from a 2005 Privacy conference paper by Thilo Weichert (Privacy and Data Protection in federal police cooperation [original link on Wayback Machine here]):

As working in joint committees naturally is a more informal cooperation, supervision as regards data protection is practically impossible. It is not ensured that personal data transmissions are put down in a protocol being checked. Therefore, it is usually impossible to find out the point of origin of specific information, whether it was obtained lawfully and how its utilisation is limited. In this context one can even talk about data laundering facility: data obtained unlawfully can be passed across the table and be processed without complaints by the receiver in a now cleaned form and can thereupon be passed back.

A more recent reference in ThinkMind // International Journal On Advances in Security, volume 2, numbers 2 and 3, 2009 on Design Patterns for a Systemic Privacy Protection identifies the following:

Problem Situation 4 – Data laundering. Companies are paying a lot of money for personal and group profiles and there are market actors in position to sell them.
This is clearly against data protection principles. This phenomenon is known as ‘data laundering’. Similar to money laundering, data laundering aims to make illegally obtained personal data look as if they were obtained legally, so that they can be used to target customers.

This example is also referred to from an EU Sixth Framework Information Society Technologies deliverable – Safeguards in a World of Ambient Intelligence (SWAMI) Threats, Vulnerabilities and Safeguards in Ambient Intelligence Deliverable D3 3 July 2006 which cites the source as the second SWAMI deliverable. SWAMI-D2 describes the process of data laundering as follows: “Via a large number of transactions and operations, the illegal origin (illegal collection) of personal data can be camouflaged”.

The third deliverable goes on to make the following recommendation:

A means to prevent data laundering could be an obligation imposed on those who buy or otherwise acquire databases, profiles and vast amounts of personal data, to check diligently the legal origin of the data. If the buyer does not check the origin and/or the legality of the databases and profiles, he could be considered equal to a receiver of stolen goods and thus held liable for illegal data processing. An obligation could also be created which would require buyers to notify the national data protection officers when personal data(bases) are acquired. Persons or companies involved or assisting in data laundering could be made subject to criminal sanctions.

The SWAMI reports thus situate data laundering in the context of invasions of privacy and/or contraventions of data protection legislation. State-sponsored, rather than evil criminal mafia initiated, usage of illegally acquired data (eg US gov’t data-laundering: using corporate databases to get around privacy law) also falls into a broadly similar area of data protection/privacy law abuse.

The term “data laundering” also appears, with some variation in usage, in the sense of data cleaning (aka data cleansing); eg Quick and Dirty Data Laundering: A Scalable Solution for Range Checking Data, and Data laundering by target rotation in chemistry-based oil exploration.

The sense in which I first came across the term was whilst discussing a data laundry process that could replace metadata records or fields in library catalogues that are tainted with commercial license restrictions with data of equivalent or higher quality, known provenance and open license terms (Open Data Processes: the Open Metadata Laundry).

The notion I was going for in Sleight of Hand and Data Laundering in Evidence Based Policy Making is different again. Whilst it shares the SWAMI characterisation insofar as it relates to the practice of removing provenance traces from a data set, it does not assume that the data was acquired illegally, and it also differs in the purpose to which the laundered data is applied. In the sense I intended, the data is legal but is of low or unverified quality, contains a significant bias, or has a provenance that may lead to a conflict of interest arising from the use to which the data is to be put. The laundering is there not to remove traces of the illegal provenance of the data, but to mask the original provenance with a provenance, authority or veneer of quality associated with another agent, such that the data becomes accepted “at face value” with the imprimatur of an independent trusted party. The second part of my take on data laundering was the use to which the laundered data might be put. Specifically, having been laundered of its dubious provenance, and remarqued with a stamp of independent and/or trusted authority, the data would continue to make its way through a policy development process with the intention that it would influence the policy decision in favour of the outcome preferred by the agent who insinuated the original data into the data laundering chain.

Compare this with the Wikipedia description of money laundering: “Money laundering often occurs in three steps: first, cash is introduced into the financial system by some means (‘placement’), the second involves carrying out complex financial transactions in order to camouflage the illegal source (‘layering’), and the final step entails acquiring wealth generated from the transactions of the illicit funds (‘integration’).”

I would contend that there are thus several different sorts of data malpractice that we might term data laundering, and that one of the tasks facing a Fourth Estate might be to clarify and chase down these various abuses of process, whether they occur in the corporate world, academia, the public sector or in government itself.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

One thought on “Several Takes on the Notion of “Data Laundering””

  1. Here’s another reference: “Buying You: The Government’s Use of Fourth-Parties to Launder Data about ‘The People'”, Joshua L. Simmons (Kirkland & Ellis LLP), September 19, 2009, Columbia Business Law Review, Vol. 2009, No. 3, p. 950 http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1475524
    “The CIA, FBI, Justice Department, Defense Department, and other government agencies are at this very moment turning to a group of companies to provide them information that these companies can gather without the restrictions that bind government intelligence agencies. The information is gathered from sources that few would believe the government could gain unfettered access to, but which, under current Fourth Amendment doctrine and statutory protections, are completely accessible.
    “Fourth-parties, such as ChoicePoint or LexisNexis, are private companies that aggregate data for the government, and they comprise the private security-industrial complex that arose after the attacks of September 11, 2001. They are in the business of acquiring information, not from the information’s originator (the first-party), nor from the information’s anticipated recipient (the second-party), but from the unavoidable digital intermediaries that transmit and store the information (third-parties). These fourth-party companies act with impunity as they gather information that the government wants but would be unable to collect on its own due to Fourth Amendment or statutory prohibitions. This paper argues that when fourth-parties disclose to law enforcement information generated as a result of searches that would be violations had the government conducted the searches itself, those fourth-parties’ actions should be considered searches by agents of the government, and the data should retain privacy protections.”
