A few more bits and pieces around the possible distribution and application of open public data (that is, openly licensed data released by public bodies):
- Bills before Parliament – Education (Information Sharing) Bill 2013-14: although this is a private member’s bill, explanatory notes have been prepared by the Department for Education. The bill allows for “student information of a prescribed description” to be made available to a “prescribed person” or “a person falling within a prescribed category”. If the bill goes through, keeping tabs on these prescriptions will be key to seeing how this might play out.
As mentioned in my Rambling Round-Up of Some Recent #OpenData Notices from August, the HMRC is consulting on opening up access to VAT records. And through the post this week, I received a letter from the NHS regarding the sharing of data within the NHS via Summary Care Records, although this appears to be more to do with data sharing within the NHS on a case-by-case basis, rather than sharing of bulk datasets for analysis/research and/or business development. So outbreaks of planned sharing are appearing all over the place. I’m not sure what the best way of tracking such initiatives is though?
I haven’t really been tracking private members’ bills either (except the Supermarket Pricing Information Bill 2012-13 that never went anywhere!), and I’m not really sure what they signal, but some of them do make me a bit twitchy. Like the currently proposed Collection of Nationality Data Bill that will “require the collection and publication of information relating to the nationality of those in receipt of benefits and of those to whom national insurance numbers are issued.” Or the Face Coverings (Prohibition) Bill 2013-14, whereby “a person wearing a garment or other object intended by the wearer as its primary purpose to obscure the face in a public place shall be guilty of an offence.” As discussions regarding privacy and anonymity on the web ebb and flow, it’s interesting to see how they’re tracked “IRL”. If a space is public, do you have any right to privacy or anonymity?
- ESRC Pre-call: Business and Local Government Data Research Centres – Big Data Network Phase 2:
The ESRC’s Big Data Network will support the development of a network of innovative investments which will strengthen the UK’s competitive advantage in Big Data. The core aim of this network is to facilitate access to different types of data and thereby stimulate innovative research and develop new methods to undertake that research. This network has been divided into three phases.
- In Phase 1 of the Big Data Network, the ESRC has invested in the development of the Administrative Data Research Network (ADRN) which will provide access to de-identified administrative data collected by government departments for research use
- Phase 2, which is the focus of this pre-announcement, will focus primarily on business data and local government data
- Phase 3, further details of which will be released in the Autumn, will focus primarily on third sector data and social media data
- Progress continues on the smart meter roll out program, with huge chunks of money being lined up for a few lucky companies (Government Selects Favourites For The Smart Meter Roll-Out). See also the Energy and Climate Change Select Committee inquiry – “Smart meter roll-out” and their Smart meter roll out report. Whilst the drivers are presumably related to more efficient energy management, there are plenty of surveillance opportunities arising! Whilst not public data, as such, the availability (and sharing with data aggregators) of smart meter data does form part of the government’s #midata programme (around which the current strategy appears to be “the less said the better”…)
- Maybe of interest to hardcore openspending data geeks, Local Audit and Accountability Bill 2013-14 has made its way from the Lords into the Commons. Schedule 9 introduces regulations around data matching, described as “an exercise involving the comparison of sets of data to determine how far they match (including the identification of any patterns and trends)”, although “data matching exercise[s] may not be used to identify patterns and trends in an individual’s characteristics or behaviour which suggest nothing more than the individual’s potential to commit fraud in the future”. A code of practice is also required. The power “is exercisable for the purpose of assisting in the prevention and detection of fraud” although the schedule may be amended in order to assist: “a) in the prevention and detection of crime (other than fraud), (b) in the apprehension and prosecution of offenders, and (c) in the recovery of debt owing to public bodies”.
Schedule 11 covers the Disclosure of Information. Where an auditor obtains information from a public body “[a] local auditor, or a person acting on the auditor’s behalf, may also disclose information to which this Schedule applies except where the disclosure would, or would be likely to, prejudice the effective performance of a function imposed or conferred on the auditor by or under an enactment”. I’m not sure to what extent such information might be requestable from the local auditor though?
I have to admit, I’m losing track of all these data and information related laws. And I guess I should also admit that I don’t really understand what any of them actually mean, either…!;-)
When faced with a car parking charge of £1.90 and a “no change” ticket machine, how much do we actually end up paying?
A recent report on English Local Authority Parking Finances by the RAC Foundation reviews the surpluses made by local councils when comparing the revenue they generate from local parking and traffic enforcement notice charges and the costs associated with providing those services. Across all the English councils, it seems to amount to £565 million for the most recently reported on period, the financial year 2011-2012. From the reported figures, income of £1,371 million is generated with costs of £806 million and a surplus of £565 million, a gross margin of 41.2%.
Presumably in an attempt to make a better story for unwary journalists making back of the envelope percentage calculations, the report describes how councils “collect around £1.4 billion [rounding up from £1,371 million] from parking tickets, permits and penalties, spend around £0.8 billion [rounding down, slightly, from £806 million] and make a surplus of £0.6 billion [rounding up from £565 million]”. The gross margin calculation using these numbers is 0.6/1.4 * 100% = 42.86%, which we might typically round up to 43%, compared to the proper rounding of the original amount, which would be 41%.
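The two margin calculations are easy to check for yourself. Here's a minimal sketch using only the figures quoted above (in £ millions for the reported amounts, £ billions for the rounded headline ones); the function and variable names are mine, not the report's:

```python
def gross_margin(income, costs):
    """Return (surplus, margin as a percentage of income)."""
    surplus = income - costs
    return surplus, 100 * surplus / income

# Reported figures, in £ millions
surplus, margin = gross_margin(1371, 806)

# Rounded headline figures, in £ billions: surplus 0.6 on income 1.4
headline_margin = 100 * 0.6 / 1.4

print(f"Reported: surplus £{surplus}m, margin {margin:.1f}%")
print(f"Headline: margin {headline_margin:.2f}%")
```

The difference between the two comes entirely from rounding the surplus up and the income up by different proportions before dividing.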
41% is still a great rate of return, of course! But is it fair? In written evidence to the current House of Commons Transport Select Committee on local authority parking enforcement, the RAC Foundation noted that “There is evidence that official guidance to TMA 2004 [Operational Guidance to Local Authorities: Parking Policy and Enforcement] on parking charges is not strictly adhered to, and that councils set parking charges with the likelihood of them realising a surplus. It should be clear to all local authorities that they have no legal powers to set parking charges at a higher level than that needed to achieve the objective of relieving or preventing congestion of traffic.”
Referring to the guidance itself, we see that setting the price of parking is something of a dark art that can use consumer psychology to influence behaviour in support of a particular transport policy.
4.8 When setting charges, authorities should consider the following factors:
- parking charges can help to curb unnecessary car use where there is adequate public transport or walking or cycling are realistic alternatives, for example, in town centres;
- charges can reflect the value of kerb-space, encouraging all but short-term parking to take place in nearby off-street car parks where available. This implies a hierarchy of charges within a local authority area, so that charges at a prime parking space in a busy town centre would normally be higher than those either at nearby off-street car parks or at designated places in more distant residential areas. Such hierarchies should be as simple as practicable and applied consistently so that charge levels are readily understandable and acceptable to both regular and occasional users;
- charges should be set at levels that encourage compliance with parking restrictions. If charges are set too high they could encourage drivers to risk non-compliance or to park in unsuitable areas, possibly in contravention of parking restrictions. In certain cases they could encourage motorists to park in a neighbouring local authority area which may not have the capacity to handle the extra vehicles. In commercial districts this may have a negative impact on business in the area; and
- if on-street charges are set too low, they could attract higher levels of traffic than are desirable. They could discourage the use of off-street car parks and cause the demand for parking spaces to exceed supply, so that drivers have to spend longer finding a vacant space.
Balancing these policy objectives against claims that the level of surplus being generated is unfair is something that each council needs to justify to its own constituents. When making such a justification, it would seem likely that representation could be made on several different levels – by considering overall revenues, costs and surpluses; by looking at the occupancy volumes or rates for different car parking spaces; or at the level of actual car parking tariffs (that is, how much it costs to park for an hour in a particular location).
Most of us feel the pain at the everyday level, of course, when it comes to actually finding and paying for car parking. But are we paying more than we need to, nudged into contributing to additional surpluses over and above what a quick calculation based on parking volumes and tariffs (that is, charges for parking) might suggest is the “planned” surplus? I thought I’d put my data sleuth hat on to try and find out how much extra money could be made by not providing change…
Take my local council, for example, on the Isle of Wight. The main civic car park in the charming harbour town of Yarmouth has a range of ticket prices, including a £1.90 rate for stays between one and two hours, and a £3.40 rate for durations between two and four hours. The two ticket machines are both cash based and don’t offer change. Many retailers know that pricing goods at £something.99 helps encourage sales, although how psychological pricing tricks like this actually work is still open to debate. (For more on the psychology of pricing, see the OFT commissioned report on Pricing Practices: Their Effects on Consumer Behaviour and Welfare.) In a “no change” payment setting, might we use related psychological tricks in association with the value of our coinage (1p, 2p, 5p, 10p, 20p, 50p, £1) to apparently set one price, which we must defend, whilst on average expecting the payment of a larger amount? That is, might we choose a £1.90 price point in the expectation that we might actually make £2 on many of the transactions?
Via a Freedom of Information request, I asked the Isle of Wight council for the number of tickets issued within each price band for the Yarmouth town car park during 2012/13, along with the revenue generated by each of the two ticket machines. Using this information, we can calculate how much additional revenue is generated for each price band based on overpayments:
In the grander scheme of things, this doesn’t amount to a huge sum of money (the total overpayments come to £2272.15, or 1.7% of the total revenue), though it must be remembered that this refers to just a single car park in a single local council area.
If we look at the raw data that details the actual payment made for each ticket issued by the ticket machines at the £1.90 tariff level, we can see how many people actually overpay:
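| Actual Payment (£) | Count |
|---|---|
| 1.90 | 10237 |
| 1.95 | 19 |
| 2.00 | 7734 |
| 2.05 | 6 |
| 2.10 | 16 |
| 2.15 | 1 |
| 2.20 | 16 |
| 2.30 | 7 |
| 2.35 | 1 |
| 2.40 | 22 |
| 2.50 | 39 |
| 2.55 | 1 |
| 2.60 | 7 |
| 2.70 | 7 |
| 2.75 | 1 |
| 2.80 | 2 |
| 2.90 | 11 |
| 2.95 | 1 |
| 3.00 | 134 |
| 3.05 | 2 |
| 3.10 | 3 |
| 3.20 | 18 |
| 3.30 | 10 |
| 3.35 | 3 |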
Of the 18,298 tickets issued at the £1.90 level for the Yarmouth town car park during financial year 2012/13, it would appear that over 40% of the tickets issued generated £2 in revenue, presumably because drivers didn’t have the exact change to hand.
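For anyone wanting to rerun the sums, here's a minimal sketch that works through the payment counts listed above for the £1.90 tariff. Amounts are held in pence to avoid floating point rounding; the variable names are just my own labels:

```python
PRICE_PENCE = 190  # the £1.90 tariff

# Actual payment (pence) -> number of tickets, as listed above
payments = {
    190: 10237, 195: 19, 200: 7734, 205: 6, 210: 16, 215: 1,
    220: 16, 230: 7, 235: 1, 240: 22, 250: 39, 255: 1,
    260: 7, 270: 7, 275: 1, 280: 2, 290: 11, 295: 1,
    300: 134, 305: 2, 310: 3, 320: 18, 330: 10, 335: 3,
}

total_tickets = sum(payments.values())

# Tickets where more than the tariff was paid
overpaid_tickets = sum(n for paid, n in payments.items() if paid > PRICE_PENCE)

# Total amount paid over and above the tariff, in pence
overpayment_pence = sum((paid - PRICE_PENCE) * n for paid, n in payments.items())

# Share of tickets that brought in exactly £2
share_paying_two_pounds = 100 * payments[200] / total_tickets

print(f"Tickets issued: {total_tickets}")
print(f"Tickets overpaid: {overpaid_tickets}")
print(f"Overpayment: £{overpayment_pence / 100:.2f}")
print(f"Paid exactly £2: {share_paying_two_pounds:.1f}%")
```

The same approach should work for any other tariff band, given the equivalent payment counts from the ticket machines.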
Whilst it would be easy enough to exclaim “We can only guess at how much extra money English councils raise in this way”, that’s not strictly true. We could find out exactly by making FOI requests to them all…
Investigations such as this often raise more questions than they answer. For example: what parking tariff bands does your local council use? How much overpayment are you “happy” to make for your car parking ticket? If there were increases in charges from an amount such as £1.40 to £1.60, what might that have done for actual revenues raised within that price band? If you start exploring this topic in your local area, please let me know via the comments:-)
PS see also this Telegraph article on Academic finds link between parking tickets and wardens’ overtime.
With my growing unease about just what the agenda driving open government/public data is, I think I’m going to have to find some time away to walk the dog lots, and mull over what pieces might be part of the jigsaw, as well as having a go at trying to put some of them together…
Near the top of the list is a concern about information asymmetry and how open data may be used by private concerns to provide a one-off advantage for them when it comes to poaching services from the public sector. How so? My gut reaction thinking is this: if, as part of the procurement process, the private sector can use open public data to help it secure a contract in competition with a public sector provider, then when contracts come to renewal the public sector may know less when it comes to bid than the private sector company was able to learn when it first tendered. The question here is: does open public data put private sector companies at an advantage when it comes to bidding for public service contracts against an incumbent public provider, compared to a public body bidding to recapture a service from an incumbent private provider, given that the private provider may not be required to open up information (for example, through FOI requests, transparency or public reporting obligations) in the same way that a public body is?
Another take on a similar theme is the extent to which there may be a loss of transparency when a service goes from a public to a private provider. If we think there is some benefit to be had from transparency in general terms, then private providers of public services should have the same openness requirements placed on them as the public body. If private companies can claim revealing information is against their commercial interest, can public bodies make the same claims on exactly the same terms under FOI exemption rules, for example (eg MoJ Freedom of information guidance: Exemptions guidance – Section 43: Commercial interests)?
Taking the NHS as a case example, here are a few things on my reading list:
- Monitor report from March 2013 on A fair playing field for the benefit of NHS patients [actual report]. For example, the report identified the following distortions:
1. Participation distortions. Some providers are directly or indirectly excluded from offering their services to NHS patients for reasons other than quality or efficiency. Restrictions on participation disadvantage providers seeking to expand into new services or new areas, regardless of whether the providers are public, charitable or private. Participation distortions disadvantage nonincumbent providers of every type.
2. Cost distortions. Some types of provider face externally imposed costs that do not fall on other providers. On balance, cost distortions mostly disadvantage charitable and private health care providers compared to public providers.
3. Flexibility distortions. Some providers’ ability to adapt their services to the changing needs of patients and commissioners is constrained by factors outside their control. These flexibility distortions mostly disadvantage public sector providers compared to other types.
I’m not sure to what extent, if any, the report reviews distortions and asymmetries arising from open data issues.
A search of the report for mentions of FOI turns up:
Historically, public providers have faced higher levels of scrutiny than other providers, including requests for information under the Freedom of Information Act. This degree of scrutiny can improve accountability to patients and promote good practice. Freedom of Information requirements have been extended through the standard NHS contract to private and charitable providers. However, it is not clear that this is operating effectively as yet, and other aspects of transparency do not apply across all types of provider.
29. The Government and commissioners should ensure that transparency, including Freedom of Information requirements, is implemented across all types of provider of NHS services on a consistent basis.
As I said, it’s on the reading list…
- A terrifying post on the Computer Weekly/Public Sector IT blog – NHS watchdog commandeers data in bid to stimulate privatization and an earlier one on the naive take on hospital mortality data: Data regime makes merciless start on NHS privatization. Are there any reports or strategy documents from the Care Quality Commission (CQC) I need to add to my reading list?
- Something academic… such as this piece from the Proceedings of the 21st European Conference on Information Systems on The Generative Mechanisms of Open Government Data, much of which I suspect is summarised by these two figures taken (without permission) from the paper:
- Opening up data (particularly data held by public bodies) around private companies is another area I can’t quite get my head round, particularly when it comes to comparing information about the machinations of private companies as compared to public bodies. To what extent should companies that are public and limited liability have data that is held about them by public bodies be openly available, for example? Maybe related to this is a currently open BIS consultation: Company ownership: transparency and trust discussion paper as well as the HMRC consultation on Sharing and publishing data for public benefit (press release) that I linked to from yesterday’s Rambling Round-Up of Some Recent #OpenData Notices. (OpenCorporates’ Chris Taggart posts some interesting thoughts on the sheen given to the proposed release of VAT registration data to credit agencies that the consultation is in part based around: Open tax data, or just VAT ‘open wash’.) A recent edition of File on 4 (h/t @onthewight) on charity based tax fraud – Faith, Hope and… Tax Avoidance – also got me wondering further about what information is openly available about charities’ activities (eg A Quick Peek at Some Charities Data…)?
- One to dig for… via Lexology, a post on Freedom of information in the private sector? which claims that “The Confederation of British Industry (“CBI“) has revealed that it is developing ‘transparency guidelines’ that will apply to private companies that provide services to the NHS.” Have these appeared yet, even in draft or consultation form?
A few other things on my to-do list in this area: map out the lobbyists and board/panel members around open data; use disclosure logs to search for companies putting in FOI requests in different sectors; see who’s pitching ideas in to ODUG; map out who’s funding NGOs and activities in the opendata space.
Sigh… no time…;-)
PS Not sure if there is a full paper version of this…? Bates, J. 2013, Information policy in the crises of neoliberalism: the case of Open Government Data in the UK at International Association of Media and Communications Researchers Conference, Dublin, June 2013: “Whilst open data releases by the UK government have received substantial support within UK civil society, often being interpreted as a creative and innovative response to a range of social issues, and, for some, a radical challenge to key components of neoliberal capitalism, this paper argues that deeper analysis of the OGD initiative suggests that it is being shaped by the UK government and corporate interests in an attempt to leverage a distinctly neoliberal agenda. The adoption and development of the OGD agenda as core to the policy response adopted by the UK Government to conditions of political economic crisis, suggests that information policy is being implemented as a key, yet often opaque, element of the neoliberal policy toolbox.” See also an earlier paper, “This is what modern deregulation looks like”: Co-optation and contestation in the shaping of the UK’s Open Government Data Initiative (“whilst OGD [Open Government Data] might potentially support modes of transparent and democratic governance, the current ‘transparency agenda’ should be recognised as an initiative that also aims to enable the marketisation of public services, and this is something that is not readily apparent to the general observer.”) and a statement of Jo Bates’ current research project in the area: The politics of Open Government Data in the UK
PPS Supporting the idea of symmetry in reporting between public services and private companies delivering public services, Richard Murphy on Making public services accountable. And some excellent writings critiquing computational thinking and the teaching of code by Ben Williamson.
It’s been some time since I (b)logged recent reports and announcements relating to the ongoing evolution of the open data thang in the UK, so here’s a quick round-up of some of the things I have floating in my open tabs…
- Secretary of State’s Code of Practice (datasets) on the discharge of public authorities’ functions under Part 1 of the Freedom of Information Act – I guess this is the big one, the latest code of practice relating to the release of datasets under FOI. (Owen Boswarva has also compared it to the consulted upon draft.) The ICO give a quick overview as well as a specialist guidance on Datasets (FOI sections 11, 19 & 45). A sceptic might say it looks like FOIable bodies have also been given the wherewithal to set up their own data trading funds, enabled by The Freedom of Information (Release of Datasets for Re-use) (Fees) Regulations 2013. In passing, this looks like a handy place to catch up on FOI round-ups.
- Via Out-Law, HMRC consults on plans to release anonymised tax datasets, I notice that HMRC has a new consultation out: Sharing and publishing data for public benefit. Apparently, [t]his consultation brings forward three options:
- wider sharing of aggregated and anonymised tax data, for example, for the purposes of research or policy development;
- release of basic non-financial VAT registration data as public data; and
- sharing more detailed VAT registration data on a more restricted and controlled basis for specific purposes, such as credit referencing.
At first glance, section 2.4 read to me like the hatchets are out on the NHS and the marshalling of resources to drive its privatisation continues ever onwards. From the consultation, I noticed that a Tax Sector Transparency Board was set up at the end of last year, which brings the number of sector transparency/data boards to about 437, I think? (Try searching for site:gov.uk inurl:sector-transparency-board.)
In passing, I also note this response to an FOI request from 2010 in relation to accessing company VAT number data:
I believe that disclosure of a complete list of VAT numbers currently in use would be likely to prejudice the prevention or detection of crime and the assessment or collection of VAT. I have reached this conclusion as I believe that the requested information could be used by opportunistic individuals and fraudsters to hijack genuine VAT numbers in order to fraudulently present themselves to HMRC, to other traders or to prospective customers as VAT registered. VAT is charged when a VAT-registered business sells to either another business or to a non-business customer. When VAT-registered businesses buy goods or services they can generally reclaim the VAT they have paid. If fraudsters are able to charge or reclaim VAT when they are not entitled to do so, then this will result in loss to the Public Purse and to members of the public who fall victim to such fraud.
Section 31 is a qualified exemption which means that, if it applies, I must consider whether it is the public interest to override the exemption and release the information. I have very carefully considered this but have decided that on balance it is not in the public interest to release this information.
- The ICO is also in consulting mode, running a Consultation on the “Conducting privacy impact assessments” code of practice. The consultation isn’t posted on a page with a sensible URL though, it’s linked via the “current consultations” page, so if you’re reading this a month or two after the time of writing, you’ll probably need to look in the closed consultations area of the site. Go figure…
- I had a little play with FOI myself recently. G4S was in the news again in a minor scandal about overcharging the Ministry of Justice on tagging contracts. I thought I’d have a peek at the MoJ spending data with respect to G4S, but they’ve been slipping in their transparency duties, so I felt obliged to FOI the spending data. No reply as yet – and I’m not sure if the data has gone up via their transparency pages yet, either?
- I’ve recently started picking up on the creation of research panels and government department data labs. For example, the HMRC datalab and the more recent Justice data lab, which looks like an interesting resource for charities and other agencies working in the justice sector who need to demonstrate impact… HEFCE are also trying to open up access, sort of, to student survey data by means of the National Student Survey research panel. I suspect that the NHS (and the DfE, eg via the National Pupil Database) have data access initiatives, as well as data linkage services? For example, the Linked Hospital Episode Statistics and Mental Health Minimum Data Set. In the academic health research area, see also the Expert Advisory Group on Data Access.
- A handful of recent reports on how open data is perceived and being used: from Sciencewise, a June 2013 report on Public views on open data; from the Department for Work and Pensions (DWP), a couple of brief reports on “how DWP uses transparency and open data to improve public services and accountability”. Alternatively, for a few thousand dollars, you can get a Forrester Research report on Getting The Most Out Of Open Data. One of the opendata “success” stories I heard championed most recently was by Sir Nigel Shadbolt at the Guardian Activate Summit. Apparently, the release of spending data has resulted in the “success” of private companies selling procurement advice based on an analysis of the data back to the public bodies, though I don’t think any specifics were mentioned. Are there any papers out there looking at how open data is being used to drive privatisation and destroy public services, I wonder?
- More research centre initiatives, from another report that I missed when it came out in December 2012 – The UK Administrative Data Research Network: Improving Access for Research and Policy. It would be interesting to see how the models proposed in this report compare to the structures used by the government datalabs?
And finally, an even older report I’d not picked up on before. From the Audit Commission in March 2010, a discussion paper: The Truth is Out There – Transparency in an Information Age. I keep meaning to do a history of UK open government data over the last few years, so checkpoints like this are interesting when it comes to logging the hopes and aspirations, as well as the claims that were being made in support of developing policy, from back in the day. Also on the to do list is a post about how I’m increasingly uncomfortable with the whole open data thing, and what motivations are actually driving it at the policy (and lobbyist) level…
How can stats and data publishers, from NGOs and (inter)national statistics agencies to scientific researchers, publish their data in a way that supports its analysis directly, as well as in combination with other datasets?
Here’s one approach I learned about from Michael Kao of the UN Food and Agriculture Organisation statistics division, FAOStat.
At first glance, the FAOStat website offers a rich interface that supports data downloads, previews and simple analysis tools around a wide variety of international food related datasets:
One problem with having so many controls and fields available is that it can be hard to know where (or how) to get started – a bit like the problem of being presented with an empty SPARQL query box…
It would be quite handy to be able to set – and save with meaningful labels – preference sets about the countries you’re interested in so you don’t have to keep scrolling through long country lists looking for the countries you want to generate reports for? (Support for “standard” groupings of countries might also be useful?) Being able to share URLs to predefined reports might also be handy? But this would possibly make the site even more complex to use!
One easier way of working with FAOStat data, particularly if you access the FAO datasets regularly, might be to take a programmatic route using the FAOStat R package. Making datasets available in ways that bring that data directly into a desktop analysis environment where they can be worked on without requiring cleaning or other forms of tidying up (which is often the case when data is made available via Excel spreadsheets or CSV files) is a trend I hope we see more of. (That is not to say that data shouldn’t also be published in “generic” document formats…). If you are using a reproducible research strategy, queries to original datasources provide implicit, self-describing metadata about the data source and the query used to return a particular dataset, metadata that is all too easy to lose, or otherwise detach from a dataset when working with downloaded files.
I haven’t had chance to play with this package yet – it’s still in testing anyway, I think? – but it looks quite handy at a first glance (I need to do a proper review…). As well as providing a way of running data grab queries over the FAO FAOSTAT and World Bank WDI APIs, it seems to provide support for “linkage”. As the draft vignette suggests, “Merge is a typical data manipulation step in daily work yet a non-trivial exercise especially when working with different data sources. The built in mergeSYB function enables one to merge data from different sources as long as the country coding system is identified. … Data from any source with [a] classification [supported by the package] can be supplied to mergeSYB in order to obtain a single merged data. (sic)”. Supported formats currently include: United Nations M49 country standard [UN_CODE]; FAO country code scheme [FAOST_CODE]; FAO Global Administrative Unit Layers (GAUL) [ADM0_CODE]; ISO 3166-1 alpha-2 [ISO2_CODE]; ISO 3166-1 alpha-2 (World Bank) [ISO2_WB_CODE]; ISO 3166-1 alpha-3 [ISO3_CODE]; ISO 3166-1 alpha-3 (World Bank) [ISO3_WB_CODE].
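To give a feel for why the country coding system matters when merging, here's a minimal sketch in plain Python (not R, and not the package's mergeSYB function, whose actual behaviour I haven't checked). It recodes one table from ISO 3166-1 alpha-2 to alpha-3 and then joins on the shared code. The function name, the lookup table and the indicator values are all made up for illustration:

```python
# Small hand-written lookup between ISO 3166-1 alpha-2 and alpha-3 codes
ISO2_TO_ISO3 = {"GB": "GBR", "FR": "FRA", "DE": "DEU"}

def merge_by_country(iso3_rows, iso2_rows, code_map=ISO2_TO_ISO3):
    """Inner-join two {code: {field: value}} tables on ISO3 country code.

    iso3_rows is keyed by alpha-3 code; iso2_rows is keyed by alpha-2 code
    and is recoded to alpha-3 before the join. Countries missing from
    either table (or from the lookup) are dropped.
    """
    recoded = {code_map[c]: row for c, row in iso2_rows.items() if c in code_map}
    return {
        code: {**iso3_rows[code], **recoded[code]}
        for code in iso3_rows
        if code in recoded
    }

# Placeholder values only, not real statistics
source_a = {"GBR": {"indicator_a": 1.0}, "FRA": {"indicator_a": 2.0}}
source_b = {"GB": {"indicator_b": 10.0}, "DE": {"indicator_b": 30.0}}

merged = merge_by_country(source_a, source_b)
print(merged)
```

Only the country present in both sources survives the join, which is exactly the sort of silent data loss that makes an explicit, well-maintained country code mapping worth having.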
By releasing an “official” R package to access the FAOStat API, it occurs to me that this makes it much easier to start building sector specific Shiny applications around particular datasets? I wonder whether the FAOstat folk have considered whether there is a possibility of developing a small Shiny app or custom client ecosystem around their data, even if it just takes the form of a curated set of gists that can be downloaded directly into RStudio, for example, using runGist?
I don’t know whether the Eurostat EC Statistics database has an associated R package too? (If so, it could be quite interesting trying to tie them together?!) I do note, however, that Eurostat data is available for download (though I haven’t read the terms/license conditions…).
I also note that a Linked Data/SPARQL way in to Eurostat data appears to be available? Eurostat Linked Data.
[Man flu, hence the brevity of the post… skulks back off to sick bed…]
PS By the by, I notice that the NHS are experimenting with making some data releases available via Google Public Data Explorer [scroll down…]
PPS See also this package – Smarter Poland – which provides an API to the Eurostat database.
On June 28th, 2012, the open data policy white paper Unleashing the Potential was published by the Cabinet Office. In the section on “Opening Up Access to Research”, one particular paragraph runs as follows:
2.66 To further develop government policy on access to research, we are also establishing a Research Transparency Sector Board, chaired by the Minister for Universities and Science, which will consider ways in which transparency in the area of research can be a driver for innovation. Recognising that research data is different to other PSI [Public Sector Information, presumably? – ed.], the Board will consider how to implement transparency measures relating to research in a manner which protects the integrity of the research and associated intellectual property, while ensuring access to research for those SME entrepreneurs vital for driving growth. This will help to realise the full benefits for society as a whole. The Research Transparency Sector Board will consist of government departments, funding agencies and representatives from universities and other stakeholders, and among the first of its tasks will be to consider how to act on the recommendations of the Royal Society report.
The announcement of the board (referred to as the Research Sector Transparency Board – which makes more sense…) was welcomed by the Royal Society in a guest blog post on the data.gov.uk website dated 27th June 2012 (the day before the embargo lifted? I’m not sure when the blog post actually became public): An intelligently open enterprise.
The minutes of a Regular meeting of the ICO Higher Education sector panel on FOI and DP (24.09.2012), dated 16/10/12, note the following:
Research data caused much concern. VA reminded delegates that she does need input from Research Councils and BIS in this area, as stated in the draft DD [HE definition document]. Definitions of “publicly funded” and “key outputs” may need clarification. It was noted that the Engineering and Physical Sciences Research Panel had to produce this type of data to an agreed timetable by 2015. It was also mentioned that the Open Data White Paper announced the formation of a new Research Sector Transparency Board and it was suggested that HEI research data could be linked to that format – it is not yet ready for use but might be worth noting in the new DD that this is a future aim.
Correspondence from House of Lords European Union Select Committee includes a letter from David Willetts MP dated 25 October 2012 that refers to his anticipated chairing of the Board:
On the question of Open Access (OA), I was pleased to note your expressed support for Open Data (OD) for which the UK is again identified as a good example. We have made excellent progress through the Finch Report on expanded access to research publications and the Government’s response to it. OD is at a relatively early stage. Some initiatives are already in train under Government’s Transparency Agenda, as detailed in the Cabinet Office White Paper, Open Data: Unleashing the Potential. This includes establishment of the Research Sector Transparency Board, which I shall be chairing. The Board will want to examine the complex issues around increasing the sharing of research data. The Research Councils’ published Open Access policy makes appropriate reference to research data, and the recent Royal Society report has informed the discussion, but work is needed on deciding further measures and implementing these appropriately, with the right terms and conditions and timing for disclosure.
We cannot be complacent and we will want to consider how best to monitor the take-up of Gold OA both here in the UK and overseas. The HEFCE-funded Joint Infrastructure Systems Committee (JISC), OAIG, and the Research Innovation Network (RIN) are already active in monitoring OA trends generally. HEFCE also envisages a possible role for JISC in monitoring the effectiveness – and effects – of Government OA policy. I expect that the Research Sector Transparency Board will also take an interest in OA policy implementation.
The 2012 BIS Annual Innovation Report from November 2012 referred to the announcement of the Board, making me wonder how many other Annual Reports celebrate the announcement of vapour:
10.3 Open data and transparency
We have continued to work to harness the potential and collaborative opportunities offered by wider use of open data.
In June 2012 the Government announced in its Open Data White Paper that we would set up a Research Sector Transparency Board. The Board will consider how transparency in research can be a driver for innovation and discovery while furthering the UK’s recognised excellence in science. It will advise Government transparency issues relating to the national research effort, and improved access for small and medium businesses to the research base. Amongst its first tasks will be to consider and address the recommendations of the Royal Society report, Science as an Open Enterprise, into the sharing and disclosing of research data.
We also established the Administrative Data Taskforce, in December 2011. It will publish proposals for new mechanisms and collaborative agreements to enable and promote the wider use of administrative data for research and policy purposes, before the end of the year.
(I’m not sure I’d picked up on the Administrative Data Taskforce before? It reported in December 2012: The UK Administrative Data Research Network: Improving Access for Research and Policy. This report looks like it could be worth reading – a quick skim reveals several sections on legal and ethical issues related to linking administrative data to other datasets.)
A Hansard reported Written Answer to the House of Lords from 12 Dec 2012 (Column WA241) from The Parliamentary Under-Secretary of State, Department for Business, Innovation and Skills (Lord Marland) on questions referring to open access to research data records:
Any further opening up of access to data, in the context of the wider open data agenda, would be the subject of future discussions with the research councils and other parties including the Data Strategy Board and representative university bodies. These policy issues would also be considered as appropriate by the Research Sector Transparency Board which is chaired by David Willetts. There are no proposals to change the research councils’ policy on access to data at this time.
The Russell Group response to the House of Lords Science and Technology Committee’s inquiry on open access publishing, dated 24 January 2013, makes the following reference to the board:
1.3 The Russell Group has been monitoring the development of open access (OA) policy for some time. We followed the ‘Finch Review’ and Royal Society work on science as an open enterprise with interest and the Russell Group is now represented on the Research Sector Transparency Board which will be covering OA, open data and other issues over the coming year. We have recently had a number of meetings with Research Councils UK (RCUK) to discuss implementation of OA policy.
This suggests that membership of the board has been decided upon, at least partially?
A HEFCE letter on Open access and submissions to the REF post-2014 dated 25/2/13 refers to the board in the following terms:
25. With the Research Councils and the Research Transparency Sector Board, we are giving consideration to the issues involved in increasing access to research data. We are committed to working in dialogue with the sector to develop fair and balanced mechanisms to achieve this aim.
Again, this suggests that the Board has been convened.
So I wonder:
- What is the actual name of the board – Research Transparency Sector Board or Research Sector Transparency Board ;-)? (Other sectors have Transparency Boards….)
- What is the membership of the board and has it convened yet?
- What are the terms of reference for the board?
- If it has convened, where are the minutes?
By the by, I note the emergence of the Research Councils UK – Gateway to Research, which provides a single point of access to “[k]ey data from the seven UK Research Councils in one location.”
This site appears to collate information about research grants, grantees, and publications by grant, across the Research Councils (I’m not sure if an #opendata dump is available though, which would mean I don’t need to scrape across all the sites using Scraperwiki any more?!;-)
PS it seems a tweet about the first meeting appeared whilst I was writing this post:
First meeting of the Research Sector Transparency Board today and all agree that open data are a public good – but that issue is complicated
— adam tickell (@adamtickell) February 26, 2013
No linkage that I can see yet, though?
A couple of weeks ago, I gave a presentation to the WebScience students at the University of Southampton on the topic of open data, using it as an opportunity to rehearse a view of open data based on the premise that it starts out closed. In much the same way that Darwin’s Theory of Evolution by Natural Selection is based on a major presupposition, specifically a theory of inheritance and the existence of processes that support reproduction with minor variation, so too does much of our thinking about open data derive from the presupposed fact that many of the freedoms we associate with the use of open data in legal terms arise from license conditions that the “owner” of the data awards to us.
Viewing data in this light, we might start by considering what constitutes “closed” data and how it comes to be so, before identifying the means by which freedoms are granted and the data is opened up. (Sometimes it can also be easier to consider what you can’t do than what you can, especially when answers to questions such as “so what can you actually do with open data?” attract the (rather meaningless) response: “anything”. We can then contrast what you can do in terms of freedom complementary to what you can’t…)
So how can data be “closed”?
One lens I particularly like for considering constraints that are placed on actions and actors, particularly in the digital world (although we can apply the model elsewhere) I first saw described by Lawrence Lessig in Code and Other Laws of Cyberspace: What Things Regulate: A Dot’s Life.
Here’s the dot and the forces that constrain its behaviour:
So we see, for example, the force of law, social norms, the market (that is, economic forces) and architecture, that is the “digital physical” way the world is implemented. (Architecture may of course be designed in order to enforce particular laws, but it is likely that other “natural laws” will arise as a result of any particular architecture or system implementation.)
Without too much thought, we might identify some constraints around data and its use under each of these separate lenses. For example:
- Law: copyright and database right grant the creator of a dataset certain protective rights over that data; data protection laws (and other “privacy laws”) limit access to, or disclosure of, data that contains personal information, as well as restricting the use of that data to purposes disclosed at the time it was collected. The UK Data Protection Act also underwrites the right of individuals to claim additional limits on data use, for example the rights “to object to processing that is likely to cause or is causing damage or distress; to prevent processing for direct marketing; to object to decisions being taken by automated means” (ICO Guide to the DPA, Principle 6 – The rights of individuals).
- Norms: social mores, behaviour and taboos limit the ways in which we might use data, even if that use is not constrained by legal, economic or technical concerns. For example, applications that invite people to “burgle my house” based on analysing social network data to discover when they are likely to be away from home and what sorts of valuable product might be on the premises are generally not welcomed. Norms of behaviour and everyday work practice also mean that much data is not published when there are no real reasons why it couldn’t be.
- Market: in the simplest case, charging for access to data places a constraint on who can gain access to the data even in advance of trying to make use of it. If we extend “market” to cover other financial constraints, there may be a cost associated with preparing data so that it can be openly released.
- Architecture: technical constraints can restrict what you can do with data. Digital rights management (DRM) uses encryption to render data streams unusable to all but the intended client, but more prosaically, document formats such as PDF, or the “release” of data charts as flat image files, make it difficult for the end user to manipulate as data any data resources contained in those documents.
Laws can also be used to grant freedoms where freedoms are otherwise restricted. For example:
- the Freedom of Information Act (FOI) provides a mechanism for requesting copies of datasets from public bodies; in addition, the Environmental Information Regulations “provide public access to environmental information held by public authorities”.
- the laws around copyright relax certain copyright constraints for the purposes of criticism and review, reporting, research, teaching (IPO – Permitted uses of copyright works);
- in the UK, the Data Protection Act provides for “a right of access to a copy of the information comprised in their personal data” (ICO Guide to the DPA, Principle 6).
- in the UK, the Data Protection Act regulates what can be done legitimately with “personal” data. However, other pieces of legislation relax confidentiality requirements when it comes to sharing data for research purposes. For example:
- the NHS Act s. 251 Control of patient information; for example, the Secretary of State for Health may “make regulations to set aside the common law duty of confidentiality for medical purposes where it is not possible to use anonymised information and where seeking individual consent is not practicable” (discussion). Note that there are changes afoot regarding s. 251…
- The Secretary of State for Education has specific powers to share pupil data from the National Pupil Database (NPD) “with named bodies and third parties who require access to the data to undertake research into the educational achievements of pupils”. The NPD “tracks a pupil’s progress through schools and colleges in the state sector, using pupil census and exam information. Individual pupil level attainment data is also included (where available) for pupils in non-maintained and independent schools” (access arrangements).
- the Enterprise and Regulatory Reform Bill currently making its way through Parliament legislates around the Supply of Customer Data (the “#midata” clauses) which is intended to open up access to customer transaction data from suppliers of energy, financial services and mobile phones “(a) to a customer, at the customer’s request; (b) to a person who is authorised by a customer to receive the data, at the customer’s request or, if the regulations so provide, at the authorised person’s request.” Although proclaimed as a way of opening up individual rights to access this data, the effect will more likely see third parties enticing individuals to authorise the release to the third party of the individual first party’s personal transaction data held by a second party (for example, #Midata Is Intended to Benefit Whom, Exactly?). (So you’ll presumably legally be able to grant Facebook access to your mobile phone records… Or Facebook will find a way of getting you to release that data to them without you realising you granted them that permission;-)
Contracts (which I guess fall somewhere between norms and laws from the dot’s perspective; I need to read that section of Lessig’s book again!) can also be used by rights holders to grant freedoms over the data they hold the rights for. For example, the Creative Commons licensing framework provides a copyright holder with a set of tools for relaxing some of the rights afforded to them by copyright when they license the work accordingly.
Note that “I am not a lawyer”, so my understanding of all this is pretty hazy;-) I also wonder how the various pieces of legislation interact, and whether there are cracks and possible inconsistencies between them? If there are pieces of legislation around the regulation and use of data that I’m missing, please post links in the comments below, and I’ll try and do a more thorough round-up in a follow-on post.