The Closed Route to Open Data
A couple of weeks ago, I gave a presentation to the WebScience students at the University of Southampton on the topic of open data, using it as an opportunity to rehearse a view of open data based on the premise that it starts out closed. In much the same way that Darwin’s Theory of Evolution by Natural Selection is based on a major presupposition, specifically a theory of inheritance and the existence of processes that support reproduction with minor variation, so too does much of our thinking about open data derive from the presupposed fact that many of the freedoms we associate with the use of open data in legal terms arise from license conditions that the “owner” of the data awards to us.
Viewing data in this light, we might start by considering what constitutes “closed” data and how it comes to be so, before identifying the means by which freedoms are granted and the data is opened up. (Sometimes it can also be easier to consider what you can’t do than what you can, especially when answers to questions such as “so what can you actually do with open data?” attract the (rather meaningless) response: “anything”. We can then contrast what you can do in terms of freedom complementary to what you can’t…)
So how can data be “closed”?
One lens I particularly like for considering constraints that are placed on actions and actors, particularly in the digital world (although we can apply the model elsewhere) I first saw described by Lawrence Lessig in Code and Other Laws of Cyberspace: What Things Regulate: A Dot’s Life.
Here’s the dot and the forces that constrain its behaviour:
So we see, for example, the force of law, social norms, the market (that is, economic forces) and architecture, that is the “digital physical” way the world is implemented. (Architecture may of course be designed in order to enforce particular laws, but it is likely that other “natural laws” will arise as a result of any particular architecture or system implementation.)
Without too much thought, we might identify some constraints around data and its use under each of these separate lenses. For example:
- Law: copyright and database right grant the creator of a dataset certain protective rights over that data; data protection laws (and other “privacy laws”) limit access to, or disclosure of, data that contains personal information, as well as restricting the use of that data for purposes disclosed at the time it was collected. The UK Data Protection Act also underwrites the right of individuals to claim additional limits on data use, for example the rights “to object to processing that is likely to cause or is causing damage or distress to prevent processing for direct marketing; to object to decisions being taken by automated means” (ICO Guide to the DPA, Principle 6 – The rights of individuals).
- Norms: social mores, behaviour and taboos limit the ways in which we might use data, even if that use is not constrained by legal, economic or technical concerns. For example, applications that invite people to “burgle my house” based on analysing social network data to discover when they are likely to be away from home and what sorts of valuable product might be on the premises are generally not welcomed. Norms of behaviour and everyday workpractice also mean that much data is not published when theere are no real reasons why it couldn’t be.
- Market: in the simplest case, charging for access to data places a constraint on who can gain access to the data even in advance of trying to make use of it. If we extend “market” to cover other financial constraints, there may be a cost associated with preparing data so that it can be openly released.
- Architecture: technical constraints can restrict what you can do with data. Digital rights management (DRM) uses encryption to render data streams unusable to all but the intended client, but more prosaically, document formats such as PDF or the “release” of data charts are flat image files makes it difficult for the end user to manipulate as data any data resources contained in those documents.
Laws can also be used to grant freedoms where freedoms are otherwise restricted. For example:
- the Freedom of Information Act (FOI) provides a mechanism for requesting copies of datasets from public bodies; in addition, the Environmental Information Regulations “provide public access to environmental information held by public authorities”.
- the laws around copyright relax certain copyright constraints for the purposes of criticism and review, reporting, research, teaching (IPO – Permitted uses of copyright works);
- in the UK, the Data Protection Act provides for “a right of access to a copy of the information comprised in their personal data” (ICO Guide to the DPA, Principle 6).
- in the UK, the Data Protection Act regulates what can be done legitimately with “personal” data. However, other pieces of legislation relax confidentiality requirements when it comes to sharing data for research purposes. For example:
- the NHS Act s. 251 Control of patient information; for example, the Secretary of State for Health may “make regulations to set aside the common law duty of confidentiality for medical purposes where it is not possible to use anonymised information and where seeking individual consent is not practicable” (discussion). Note that they are changes afoot regarding s. 251…
- The Secretary of State for Education has specific powers to share pupil data from the National Pupil database (NPD) “with named bodies and third parties who require access to the data to undertake research into the educational achievements of pupils”. The NPD “tracks a pupil’s progress through schools and colleges in the state sector, using pupil census and exam information. Individual pupil level attainment data is also included (where available) for pupils in non-maintained and independent schools” (access arrangements).
- the Enterprise and Regulatory Reform Bill currently making its way through Parliament legislates around the Supply of Customer Data (the “#midata” clauses) which is intended to open up access to customer transaction data from suppliers of energy, financial services and mobile phones “(a) to a customer, at the customer’s request; (b) to a person who is authorised by a customer to receive the data, at the customer’s request or, if the regulations so provide, at the authorised person’s request.” Although proclaimed as a way of opening up individual rights to access this data, the effect will more likely see third parties enticing individuals to authorise the release to the third party of the individual first party’s personal transaction data held by a second party (for example, #Midata Is Intended to Benefit Whom, Exactly?). (So you’ll presumably legally be able to grant Facebook access to your mobile phone records… Or Facebook will find a way of getting you to release that data to them without you realising you granted them that permission;-)
Contracts (which I guess fall somewhere between norms and laws from the dot’s perspective (I need to read that section of Lessig’s book again!) can also be used by rights holders to grant freedoms over the data they hold the rights for. For example, the Creative Commons licensing framework provides a copyright holder with a set of tools for relaxing some of the rights afforded to them by copyright when they license the work accordingly.
Note that “I am not a lawyer”, so my understanding of all this is pretty hazy;-) I also wonder how the various pieces of legislation interact, and whether there are cracks and possible inconsistencies between them? If there are pieces of legislation around the regulation and use of data that I’m missing, please post links in the comments below, and I’ll try and do a more thorough round up in a follow on post.