With my growing unease about just what the agenda driving open government/public data is, I think I’m going to have to find some time away to walk the dog lots, and mull over what pieces might be part of the jigsaw, as well as having a go at trying to put some of them together…
Near the top of the list is a concern about information asymmetry and how open data may be used by private concerns to provide a one-off advantage for them when it comes to poaching services from the public sector. How so? My gut reaction thinking is this: if, as part of the procurement process, the private sector can use open public data to help it secure a contract in competition with a public sector provider, then when contracts come to renewal the public sector may know less when it comes to bid than the private sector company was able to learn when it first tendered. The question here is: does open public data put private sector companies at an advantage when bidding for public service contracts against an incumbent public provider, compared to a public body bidding to recapture a service from an incumbent private provider, given that the private provider may not be required to open up information (for example, through FOI requests, transparency or public reporting obligations) in the same way that a public body is?
Another take on a similar theme is the extent to which there may be a loss of transparency when a service goes from a public to a private provider. If we think there is some benefit to be had from transparency in general terms, then private providers of public services should have the same openness requirements placed on them as the public body. If private companies can claim revealing information is against their commercial interest, can public bodies make the same claims on exactly the same terms under FOI exemption rules, for example (eg MoJ Freedom of information guidance: Exemptions guidance – Section 43: Commercial interests)?
Taking the NHS as a case example, here are a few things on my reading list:
- Monitor report from March 2013 on A fair playing field for the benefit of NHS patients [actual report]. For example, the report identified the following distortions:
1. Participation distortions. Some providers are directly or indirectly excluded from offering their services to NHS patients for reasons other than quality or efficiency. Restrictions on participation disadvantage providers seeking to expand into new services or new areas, regardless of whether the providers are public, charitable or private. Participation distortions disadvantage nonincumbent providers of every type.
2. Cost distortions. Some types of provider face externally imposed costs that do not fall on other providers. On balance, cost distortions mostly disadvantage charitable and private health care providers compared to public providers.
3. Flexibility distortions. Some providers’ ability to adapt their services to the changing needs of patients and commissioners is constrained by factors outside their control. These flexibility distortions mostly disadvantage public sector providers compared to other types.
I’m not sure to what extent, if any, the report reviews distortions and asymmetries arising from open data issues.
A search of the report for mentions of FOI turns up:
Historically, public providers have faced higher levels of scrutiny than other providers, including requests for information under the Freedom of Information Act. This degree of scrutiny can improve accountability to patients and promote good practice. Freedom of Information requirements have been extended through the standard NHS contract to private and charitable providers. However, it is not clear that this is operating effectively as yet, and other aspects of transparency do not apply across all types of provider.
29. The Government and commissioners should ensure that transparency, including Freedom of Information requirements, is implemented across all types of provider of NHS services on a consistent basis.
As I said, it’s on the reading list…
- A terrifying post on the Computer Weekly/Public Sector IT blog – NHS watchdog commandeers data in bid to stimulate privatization and an earlier one on the naive take on hospital mortality data: Data regime makes merciless start on NHS privatization. Are there any reports or strategy documents from the Care Quality Commission (CQC) I need to add to my reading list?
- Something academic… such as this piece from the Proceedings of the 21st European Conference on Information Systems on The Generative Mechanisms of Open Government Data, much of which I suspect is summarised by these two figures taken (without permission) from the paper:
- Opening up data (particularly data held by public bodies) around private companies is another area I can’t quite get my head round, particularly when it comes to comparing information about the machinations of private companies as compared to public bodies. To what extent should data held by public bodies about companies that are public and limited liability be openly available, for example? Maybe related to this is a currently open BIS consultation: Company ownership: transparency and trust discussion paper as well as the HMRC consultation on Sharing and publishing data for public benefit (press release) that I linked to from yesterday’s Rambling Round-Up of Some Recent #OpenData Notices. (OpenCorporates’ Chris Taggart posts some interesting thoughts on the sheen given to the proposed release of VAT registration data to credit agencies that the consultation is in part based around: Open tax data, or just VAT ‘open wash’.) A recent edition of File on 4 (h/t @onthewight) on charity based tax fraud – Faith, Hope and… Tax Avoidance – also got me wondering further about what information is openly available about charities’ activities (eg A Quick Peek at Some Charities Data…)?
- One to dig for… via Lexology, a post on Freedom of information in the private sector? which claims that “The Confederation of British Industry (“CBI“) has revealed that it is developing ‘transparency guidelines’ that will apply to private companies that provide services to the NHS.” Have these appeared yet, even in draft or consultation form?
A few other things on my to-do list in this area: map out the lobbyists and board/panel members around open data; use disclosure logs to search for companies putting in FOI requests in different sectors; see who’s pitching ideas in to ODUG; map out who’s funding NGOs and activities in the opendata space.
Sigh… no time…;-)
PS Not sure if there is a full paper version of this…? Bates, J. 2013, Information policy in the crises of neoliberalism: the case of Open Government Data in the UK at International Association of Media and Communications Researchers Conference, Dublin, June 2013: “Whilst open data releases by the UK government have received substantial support within UK civil society, often being interpreted as a creative and innovative response to a range of social issues, and, for some, a radical challenge to key components of neoliberal capitalism, this paper argues that deeper analysis of the OGD initiative suggests that it is being shaped by the UK government and corporate interests in an attempt to leverage a distinctly neoliberal agenda. The adoption and development of the OGD agenda as core to the policy response adopted by the UK Government to conditions of political economic crisis, suggests that information policy is being implemented as a key, yet often opaque, element of the neoliberal policy toolbox.” See also an earlier paper, “This is what modern deregulation looks like”: Co-optation and contestation in the shaping of the UK’s Open Government Data Initiative (“whilst OGD [Open Government Data] might potentially support modes of transparent and democratic governance, the current ‘transparency agenda’ should be recognised as an initiative that also aims to enable the marketisation of public services, and this is something that is not readily apparent to the general observer.”) and a statement of Jo Bates’ current research project in the area: The politics of Open Government Data in the UK
PPS Supporting the idea of symmetry in reporting between public services and private companies delivering public services, Richard Murphy on Making public services accountable. And some excellent writings critiquing computational thinking and the teaching of code by Ben Williamson.
It’s been some time since I (b)logged recent reports and announcements relating to the ongoing evolution of the open data thang in the UK, so here’s a quick round-up of some of the things I have floating in my open tabs…
- Secretary of State’s Code of Practice (datasets) on the discharge of public authorities’ functions under Part 1 of the Freedom of Information Act – I guess this is the big one, the latest code of practice relating to the release of datasets under FOI. (Owen Boswarva has also compared it to the consulted upon draft.) The ICO give a quick overview as well as a specialist guidance on Datasets (FOI sections 11, 19 & 45). A sceptic might say it looks like FOIable bodies have also been given the wherewithal to set up their own data trading funds, enabled by The Freedom of Information (Release of Datasets for Re-use) (Fees) Regulations 2013. In passing, this looks like a handy place to catch up on FOI round-ups.
- Via Out-Law, HMRC consults on plans to release anonymised tax datasets, I notice that HMRC has a new consultation out: Sharing and publishing data for public benefit. Apparently, [t]his consultation brings forward three options:
- wider sharing of aggregated and anonymised tax data, for example, for the purposes of research or policy development;
- release of basic non-financial VAT registration data as public data; and
- sharing more detailed VAT registration data on a more restricted and controlled basis for specific purposes, such as credit referencing.
At first glance, section 2.4 read to me like the hatchets are out on the NHS and the marshalling of resources to drive its privatisation continues ever onwards. From the consultation, I noticed that a Tax Sector Transparency Board was set up at the end of last year, which brings the number of sector transparency/data boards to about 437, I think? (Try searching for site:gov.uk inurl:sector-transparency-board.)
In passing, I also note this response to an FOI request from 2010 in relation to accessing company VAT number data:
I believe that disclosure of a complete list of VAT numbers currently in use would be likely to prejudice the prevention or detection of crime and the assessment or collection of VAT. I have reached this conclusion as I believe that the requested information could be used by opportunistic individuals and fraudsters to hijack genuine VAT numbers in order to fraudulently present themselves to HMRC, to other traders or to prospective customers as VAT registered. VAT is charged when a VAT-registered business sells to either another business or to a non-business customer. When VAT-registered businesses buy goods or services they can generally reclaim the VAT they have paid. If fraudsters are able to charge or reclaim VAT when they are not entitled to do so, then this will result in loss to the Public Purse and to members of the public who fall victim to such fraud.
Section 31 is a qualified exemption which means that, if it applies, I must consider whether it is the public interest to override the exemption and release the information. I have very carefully considered this but have decided that on balance it is not in the public interest to release this information.
- The ICO is also in consulting mode, running a Consultation on the “Conducting privacy impact assessments” code of practice. The consultation isn’t posted on a page with a sensible URL though, it’s linked via the “current consultations” page, so if you’re reading this a month or two after the time of writing, you’ll probably need to look in the closed consultations area of the site. Go figure…
- I had a little play with FOI myself recently. G4S was in the news again in a minor scandal about overcharging the Ministry of Justice on tagging contracts. I thought I’d have a peek at the MoJ spending data with respect to G4S, but they’ve been slipping in their transparency duties, so I felt obliged to FOI the spending data. No reply as yet – and I’m not sure if the data has gone up via their transparency pages yet, either?
- I’ve recently started picking up on the creation of research panels and government department data labs. For example, the HMRC datalab and the more recent Justice data lab, which looks like an interesting resource for charities and other agencies working in the justice sector who need to demonstrate impact… HEFCE are also trying to open up access, sort of, to student survey data by means of the National Student Survey research panel. I suspect that the NHS (and the DfE, eg via the National Pupil Database) have data access initiatives, as well as data linkage services? For example, the Linked Hospital Episode Statistics and Mental Health Minimum Data Set. In the academic health research area, see also the Expert Advisory Group on Data Access.
- A handful of recent reports on how open data is perceived and being used: from Sciencewise, a June 2013 report on Public views on open data; from the Department for Work and Pensions (DWP), a couple of brief reports on “how DWP uses transparency and open data to improve public services and accountability”. Alternatively, for a few thousand dollars, you can get a Forrester Research report on Getting The Most Out Of Open Data. One of the opendata “success” stories I heard championed most recently was by Sir Nigel Shadbolt at the Guardian Activate Summit. Apparently, the release of spending data has resulted in the “success” of private companies selling procurement advice based on an analysis of the data back to the public bodies, though I don’t think any specifics were mentioned. Are there any papers out there looking at how open data is being used to drive privatisation and destroy public services, I wonder?
- More research centre initiatives, from another report that I missed when it came out in December 2012 – The UK Administrative Data Research Network: Improving Access for Research and Policy. It would be interesting to see how the models proposed in this report compare to the structures used by the government datalabs?
And finally, an even older report I’d not picked up on before. From the Audit Commission in March 2010, a discussion paper: The Truth is Out There – Transparency in an Information Age. I keep meaning to do a history of UK open government data over the last few years, so checkpoints like this are interesting when it comes to logging the hopes and aspirations, as well as the claims that were being made in support of developing policy, from back in the day. Also on the to do list is a post about how I’m increasingly uncomfortable with the whole open data thing, and what motivations are actually driving it at the policy (and lobbyist) level…
How can stats and data publishers, from NGOs and (inter)national statistics agencies to scientific researchers, publish their data in a way that supports its analysis directly, as well as in combination with other datasets?
Here’s one approach I learned about from Michael Kao of the UN Food and Agriculture Organisation statistics division, FAOStat.
At first glance, the FAOStat website is a rich one, supporting data downloads, previews and simple analysis tools around a wide variety of international food related datasets:
One problem with having so many controls and fields available is that it can be hard to know where (or how) to get started – a bit like the problem of being presented with an empty SPARQL query box…
It would be quite handy to be able to set – and save with meaningful labels – preference sets about the countries you’re interested in so you don’t have to keep scrolling through long country lists looking for the countries you want to generate reports for? (Support for “standard” groupings of countries might also be useful?) Being able to share URLs to predefined reports might also be handy? But this would possibly make the site even more complex to use!
One easier way of working with FAOStat data, particularly if you access the FAO datasets regularly, might be to take a programmatic route using the FAOStat R package. Making datasets available in ways that bring that data directly into a desktop analysis environment where they can be worked on without requiring cleaning or other forms of tidying up (which is often needed when data is made available via Excel spreadsheets or CSV files) is a trend I hope we see more of. (That is not to say that data shouldn’t also be published in “generic” document formats…). If you are using a reproducible research strategy, queries to original datasources provide implicit, self-describing metadata about the data source and the query used to return a particular dataset, metadata that is all too easy to lose, or otherwise detach from a dataset when working with downloaded files.
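To make the point about self-describing queries a little more concrete, here is a minimal Python sketch (not anything taken from the FAOStat package) of keeping the query that produced a dataset attached to the data itself. The function names, the `example-stats-api` source label and the `fake_fetch` stand-in are all hypothetical; a real version would call an actual API client in place of `fake_fetch`.

```python
import hashlib
import json
from datetime import datetime, timezone


def fetch_with_provenance(fetch, source, query):
    """Run a query and return the rows bundled with a record of how they were obtained."""
    rows = fetch(query)
    # Hash a canonical JSON form of the rows so later changes to the data can be detected.
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return {
        "data": rows,
        "provenance": {
            "source": source,                # label for the data source (assumed name)
            "query": dict(query),            # the exact parameters used
            "retrieved_at": datetime.now(timezone.utc).isoformat(),
            "sha256": hashlib.sha256(payload).hexdigest(),
        },
    }


def fake_fetch(query):
    """Hypothetical stand-in for a real API call; returns one dummy row per country."""
    return [
        {"country": code, "year": query["year"], "value": 0}
        for code in query["countries"]
    ]


result = fetch_with_provenance(
    fake_fetch,
    "example-stats-api",
    {"countries": ["FRA", "GBR"], "year": 2010},
)
```

Saving `result` as a whole, rather than just `result["data"]`, means the query travels with the dataset instead of living in someone's memory or a lost browser tab.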
I haven’t had chance to play with this package yet – it’s still in testing anyway, I think? – but it looks quite handy at a first glance (I need to do a proper review…). As well as providing a way of running data grab queries over the FAO FAOSTAT and World Bank WDI APIs, it seems to provide support for “linkage”. As the draft vignette suggests, “Merge is a typical data manipulation step in daily work yet a non-trivial exercise especially when working with different data sources. The built in mergeSYB function enables one to merge data from different sources as long as the country coding system is identified. … Data from any source with [a] classification [supported by the package] can be supplied to mergeSYB in order to obtain a single merged data. (sic)“. Supported formats currently include: United Nations M49 country standard [UN_CODE]; FAO country code scheme [FAOST_CODE]; FAO Global Administrative Unit Layers (GAUL) [ADM0_CODE]; ISO 3166-1 alpha-2 [ISO2_CODE]; ISO 3166-1 alpha-2 (World Bank) [ISO2_WB_CODE]; ISO 3166-1 alpha-3 [ISO3_CODE]; ISO 3166-1 alpha-3 (World Bank) [ISO3_WB_CODE].
By releasing an “official” R package to access the FAOStat API, it occurs to me that this makes it much easier to start building sector specific Shiny applications around particular datasets? I wonder whether the FAOStat folk have considered whether there is a possibility of developing a small Shiny app or custom client ecosystem around their data, even if it just takes the form of a curated set of gists that can be downloaded directly into RStudio, for example, using runGist?
I don’t know whether the Eurostat EC Statistics database has an associated R package too? (If so, it could be quite interesting trying to tie them together?!) I do note, however, that Eurostat data is available for download (though I haven’t read the terms/license conditions…).
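I haven’t looked at how mergeSYB actually does its thing, so what follows is only a rough Python sketch of the general idea as I understand it from the vignette: translate one dataset’s country codes into the scheme used by the other, then join on country and year. The lookup table, the function names and all the data values are made up for illustration; only the three ISO/UN code triples are real.

```python
# Small lookup table: one row per country, one column per coding scheme.
# (A real table would cover every country and more schemes.)
COUNTRY_CODES = [
    {"ISO2_CODE": "FR", "ISO3_CODE": "FRA", "UN_CODE": 250},
    {"ISO2_CODE": "DE", "ISO3_CODE": "DEU", "UN_CODE": 276},
    {"ISO2_CODE": "GB", "ISO3_CODE": "GBR", "UN_CODE": 826},
]


def translate(rows, from_scheme, to_scheme):
    """Return copies of rows re-keyed from one country coding scheme to another."""
    lookup = {c[from_scheme]: c[to_scheme] for c in COUNTRY_CODES}
    out = []
    for row in rows:
        code = row[from_scheme]
        if code not in lookup:
            continue  # drop rows whose code cannot be translated
        new_row = {k: v for k, v in row.items() if k != from_scheme}
        new_row[to_scheme] = lookup[code]
        out.append(new_row)
    return out


def merge_on(left, right, key):
    """Inner join two lists of dicts on (key, Year)."""
    index = {(r[key], r["Year"]): r for r in right}
    merged = []
    for row in left:
        match = index.get((row[key], row["Year"]))
        if match is not None:
            merged.append({**row, **match})
    return merged


# Two made-up datasets that identify countries in different ways.
fao_like = [
    {"ISO3_CODE": "FRA", "Year": 2010, "wheat_tonnes": 100},
    {"ISO3_CODE": "GBR", "Year": 2010, "wheat_tonnes": 50},
]
wb_like = [
    {"ISO2_CODE": "FR", "Year": 2010, "population": 10},
    {"ISO2_CODE": "GB", "Year": 2010, "population": 20},
    {"ISO2_CODE": "XX", "Year": 2010, "population": 1},  # unknown code, dropped
]

merged = merge_on(
    fao_like,
    translate(wb_like, "ISO2_CODE", "ISO3_CODE"),
    "ISO3_CODE",
)
```

The fiddly bit in practice is the lookup table rather than the join, which is presumably why having a maintained set of code mappings shipped with the package is the handy part.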
I also note that a Linked Data/SPARQL way in to Eurostat data appears to be available? Eurostat Linked Data.
[Man flu, hence the brevity of the post… skulks back off to sick bed…]
PS By the by, I notice that the NHS are experimenting with making some data releases available via Google Public Data Explorer [scroll down…]
PPS See also this package – Smarter Poland – which provides an API to the Eurostat database.
On June 28th, 2012, the open data policy white paper Unleashing the Potential was published by the Cabinet Office. In the section on “Opening Up Access to Research”, one particular paragraph runs as follows:
2.66 To further develop government policy on access to research, we are also establishing a Research Transparency Sector Board, chaired by the Minister for Universities and Science, which will consider ways in which transparency in the area of research can be a driver for innovation. Recognising that research data is different to other PSI [Public Sector Information, presumably? – ed.], the Board will consider how to implement transparency measures relating to research in a manner which protects the integrity of the research and associated intellectual property, while ensuring access to research for those SME entrepreneurs vital for driving growth. This will help to realise the full benefits for society as a whole. The Research Transparency Sector Board will consist of government departments, funding agencies and representatives from universities and other stakeholders, and among the first of its tasks will be to consider how to act on the recommendations of the Royal Society report.
The announcement of the board (referred to as the Research Sector Transparency Board – which makes more sense…) was welcomed by the Royal Society in a guest blog post on the data.gov.uk website dated 27th June 2012 (the day before the embargo lifted? I’m not sure when the blog post actually became public): An intelligently open enterprise.
The minutes of a Regular meeting of the ICO Higher Education sector panel on FOI and DP (24.09.2012) dated 16/10/12 note the following:
Research data caused much concern. VA reminded delegates that she does need input from Research Councils and BIS in this area, as stated in the draft DD [HE definition document]. Definitions of “publicly funded” and “key outputs” may need clarification. It was noted that the Engineering and Physical Sciences Research Panel had to produce this type of data to an agreed timetable by 2015. It was also mentioned that the Open Data White Paper announced the formation of a new Research Sector Transparency Board and it was suggested that HEI research data could be linked to that format – it is not yet ready for use but might be worth noting in the new DD that this is a future aim.
Correspondence from House of Lords European Union Select Committee includes a letter from David Willetts MP dated 25 October 2012 that refers to his anticipated chairing of the Board:
On the question of Open Access (OA), I was pleased to note your expressed support for Open Data (OD) for which the UK is again identified as a good example. We have made excellent progress through the Finch Report on expanded access to research publications and the Government’s response to it. OD is at a relatively early stage. Some initiatives are already in train under Government’s Transparency Agenda, as detailed in the Cabinet Office White Paper, Open Data: Unleashing the Potential. This includes establishment of the Research Sector Transparency Board, which I shall be chairing. The Board will want to examine the complex issues around increasing the sharing of research data. The Research Councils’ published Open Access policy makes appropriate reference to research data, and the recent Royal Society report has informed the discussion, but work is needed on deciding further measures and implementing these appropriately, with the right terms and conditions and timing for disclosure.
We cannot be complacent and we will want to consider how best to monitor the take-up of Gold OA both here in the UK and overseas. The HEFCE-funded Joint Infrastructure Systems Committee (JISC), OAIG, and the Research Innovation Network (RIN) are already active in monitoring OA trends generally. HEFCE also envisages a possible role for JISC in monitoring the effectiveness – and effects – of Government OA policy. I expect that the Research Sector Transparency Board will also take an interest in OA policy implementation.
The 2012 BIS Annual Innovation Report from November 2012 referred to the announcement of the Board, making me wonder how many other Annual Reports celebrate the announcement of vapour:
10.3 Open data and transparency
We have continued to work to harness the potential and collaborative opportunities offered by wider use of open data.
In June 2012 the Government announced in its Open Data White Paper that we would set up a Research Sector Transparency Board. The Board will consider how transparency in research can be a driver for innovation and discovery while furthering the UK’s recognised excellence in science. It will advise Government transparency issues relating to the national research effort, and improved access for small and medium businesses to the research base. Amongst its first tasks will be to consider and address the recommendations of the Royal Society report, Science as an Open Enterprise, into the sharing and disclosing of research data.
We also established the Administrative Data Taskforce, in December 2011. It will publish proposals for new mechanisms and collaborative agreements to enable and promote the wider use of administrative data for research and policy purposes, before the end of the year.
(I’m not sure I’d picked up on the Administrative Data Taskforce before? It reported in December 2012: The UK Administrative Data Research Network: Improving Access for Research and Policy. This report looks like it could be worth reading – a quick skim reveals several sections on legal and ethical issues related to linking administrative data to other datasets.)
A Hansard reported Written Answer to the House of Lords from 12 Dec 2012 (Column WA241) from The Parliamentary Under-Secretary of State, Department for Business, Innovation and Skills (Lord Marland) on questions referring to open access to research data records:
Any further opening up of access to data, in the context of the wider open data agenda, would be the subject of future discussions with the research councils and other parties including the Data Strategy Board and representative university bodies. These policy issues would also be considered as appropriate by the Research Sector Transparency Board which is chaired by David Willetts. There are no proposals to change the research councils’ policy on access to data at this time.
The Russell Group response to the House of Lords Science and Technology Committee’s inquiry on open access publishing, dated 24 January 2013, makes the following reference to the board:
1.3 The Russell Group has been monitoring the development of open access (OA) policy for some time. We followed the ‘Finch Review’ and Royal Society work on science as an open enterprise with interest and the Russell Group is now represented on the Research Sector Transparency Board which will be covering OA, open data and other issues over the coming year. We have recently had a number of meetings with Research Councils UK (RCUK) to discuss implementation of OA policy.
This suggests that membership of the board has been decided upon, at least partially?
A HEFCE letter on Open access and submissions to the REF post-2014 dated 25/2/13 refers to the board in the following terms:
25. With the Research Councils and the Research Transparency Sector Board, we are giving consideration to the issues involved in increasing access to research data. We are committed to working in dialogue with the sector to develop fair and balanced mechanisms to achieve this aim.
Again, this suggests that the Board has been convened.
So I wonder:
- What is the actual name of the board – Research Transparency Sector Board or Research Sector Transparency Board ;-)? (Other sectors have Transparency Boards….)
- What is the membership of the board and has it convened yet?
- What are the terms of reference for the board?
- If it has convened, where are the minutes?
By the by, I note the emergence of the Research Councils UK – Gateway to Research, which provides a single point of access to “[k]ey data from the seven UK Research Councils in one location.”
This site appears to collate information about research grants, grantees, and publications by grant, across the Research Councils (I’m not sure if an #opendata dump is available though, which would mean I don’t need to scrape across all the sites using Scraperwiki any more?!;-)
PS it seems a tweet about the first meeting appeared whilst I was writing this post:
First meeting of the Research Sector Transparency Board today and all agree that open data are a public good – but that issue is complicated
— adam tickell (@adamtickell) February 26, 2013
No linkage that I can see yet, though?
A couple of weeks ago, I gave a presentation to the WebScience students at the University of Southampton on the topic of open data, using it as an opportunity to rehearse a view of open data based on the premise that it starts out closed. In much the same way that Darwin’s Theory of Evolution by Natural Selection is based on a major presupposition, specifically a theory of inheritance and the existence of processes that support reproduction with minor variation, so too does much of our thinking about open data derive from the presupposed fact that many of the freedoms we associate with the use of open data in legal terms arise from license conditions that the “owner” of the data awards to us.
Viewing data in this light, we might start by considering what constitutes “closed” data and how it comes to be so, before identifying the means by which freedoms are granted and the data is opened up. (Sometimes it can also be easier to consider what you can’t do than what you can, especially when answers to questions such as “so what can you actually do with open data?” attract the (rather meaningless) response: “anything”. We can then contrast what you can do in terms of freedom complementary to what you can’t…)
So how can data be “closed”?
One lens I particularly like for considering constraints that are placed on actions and actors, particularly in the digital world (although we can apply the model elsewhere), is one I first saw described by Lawrence Lessig in Code and Other Laws of Cyberspace: What Things Regulate: A Dot’s Life.
Here’s the dot and the forces that constrain its behaviour:
So we see, for example, the force of law, social norms, the market (that is, economic forces) and architecture, that is the “digital physical” way the world is implemented. (Architecture may of course be designed in order to enforce particular laws, but it is likely that other “natural laws” will arise as a result of any particular architecture or system implementation.)
Without too much thought, we might identify some constraints around data and its use under each of these separate lenses. For example:
- Law: copyright and database right grant the creator of a dataset certain protective rights over that data; data protection laws (and other “privacy laws”) limit access to, or disclosure of, data that contains personal information, as well as restricting the use of that data to purposes disclosed at the time it was collected. The UK Data Protection Act also underwrites the right of individuals to claim additional limits on data use, for example the rights “to object to processing that is likely to cause or is causing damage or distress; to prevent processing for direct marketing; to object to decisions being taken by automated means” (ICO Guide to the DPA, Principle 6 – The rights of individuals).
- Norms: social mores, behaviour and taboos limit the ways in which we might use data, even if that use is not constrained by legal, economic or technical concerns. For example, applications that invite people to “burgle my house” based on analysing social network data to discover when they are likely to be away from home and what sorts of valuable product might be on the premises are generally not welcomed. Norms of behaviour and everyday work practice also mean that much data is not published when there are no real reasons why it couldn’t be.
- Market: in the simplest case, charging for access to data places a constraint on who can gain access to the data even in advance of trying to make use of it. If we extend “market” to cover other financial constraints, there may be a cost associated with preparing data so that it can be openly released.
- Architecture: technical constraints can restrict what you can do with data. Digital rights management (DRM) uses encryption to render data streams unusable to all but the intended client, but more prosaically, document formats such as PDF, or the “release” of data charts as flat image files, make it difficult for the end user to manipulate as data any data resources contained in those documents.
Laws can also be used to grant freedoms where freedoms are otherwise restricted. For example:
- the Freedom of Information Act (FOI) provides a mechanism for requesting copies of datasets from public bodies; in addition, the Environmental Information Regulations “provide public access to environmental information held by public authorities”.
- the laws around copyright relax certain copyright constraints for the purposes of criticism and review, reporting, research, teaching (IPO – Permitted uses of copyright works);
- in the UK, the Data Protection Act provides for “a right of access to a copy of the information comprised in their personal data” (ICO Guide to the DPA, Principle 6).
- in the UK, the Data Protection Act regulates what can be done legitimately with “personal” data. However, other pieces of legislation relax confidentiality requirements when it comes to sharing data for research purposes. For example:
- the NHS Act s. 251 Control of patient information; for example, the Secretary of State for Health may “make regulations to set aside the common law duty of confidentiality for medical purposes where it is not possible to use anonymised information and where seeking individual consent is not practicable” (discussion). Note that there are changes afoot regarding s. 251…
- The Secretary of State for Education has specific powers to share pupil data from the National Pupil Database (NPD) “with named bodies and third parties who require access to the data to undertake research into the educational achievements of pupils”. The NPD “tracks a pupil’s progress through schools and colleges in the state sector, using pupil census and exam information. Individual pupil level attainment data is also included (where available) for pupils in non-maintained and independent schools” (access arrangements).
- the Enterprise and Regulatory Reform Bill currently making its way through Parliament legislates around the Supply of Customer Data (the “#midata” clauses) which is intended to open up access to customer transaction data from suppliers of energy, financial services and mobile phones “(a) to a customer, at the customer’s request; (b) to a person who is authorised by a customer to receive the data, at the customer’s request or, if the regulations so provide, at the authorised person’s request.” Although proclaimed as a way of opening up individual rights to access this data, the effect will more likely see third parties enticing individuals to authorise the release to the third party of the individual first party’s personal transaction data held by a second party (for example, #Midata Is Intended to Benefit Whom, Exactly?). (So you’ll presumably legally be able to grant Facebook access to your mobile phone records… Or Facebook will find a way of getting you to release that data to them without you realising you granted them that permission;-)
Contracts (which I guess fall somewhere between norms and laws from the dot’s perspective (I need to read that section of Lessig’s book again!)) can also be used by rights holders to grant freedoms over the data they hold the rights for. For example, the Creative Commons licensing framework provides a copyright holder with a set of tools for relaxing some of the rights afforded to them by copyright when they license the work accordingly.
Note that “I am not a lawyer”, so my understanding of all this is pretty hazy;-) I also wonder how the various pieces of legislation interact, and whether there are cracks and possible inconsistencies between them? If there are pieces of legislation around the regulation and use of data that I’m missing, please post links in the comments below, and I’ll try and do a more thorough round up in a follow on post.
I’m doing a couple of talks to undergrad and postgrad students next week – on data journalism at the University of Lincoln, and on open data at the University of Southampton – so I thought I’d do a quick round up of recently advertised data related jobs that I could reuse for an employability slide…
So, here are some of the things I’ve noticed recently:
- The Technology Strategy Board, funders of many a data related activity (including the data vouchers for SMEs), are advertising for a Lead Technologist – Data Economy (£45,000 to £55,000):
The UK is increasingly reliant on its service economy, and on the ability to manage its physical economy effectively, and it exports these capabilities around the world. Both aspects of this are heavily dependent on the availability of appropriate information at the right place and time, which in turn depends on our ability to access and manipulate diverse sources of data within a commercial environment.
The internet and mobile communications and the ready availability of computing power can allow the creation of a new, data-rich economy, but there are technical, human and business challenges still to be overcome. With its rich data resources, inventive capacity and supportive policy landscape, the UK is well placed to be the centre of this innovation.
Working within the Digital team, to develop and implement strategies for TSB’s interventions in and around the relevant sectors.
This role requires the knowledge and expertise to develop priorities for how the UK should address this opportunity, as well as the interpersonal skills to introduce the relevant communities of practice to appropriate technological solutions. It also requires a knowledge of how innovation works within businesses in this space, to allow the design and targeting of TSB’s activities to effectively facilitate change.
Accessible tools include, but are not restricted to, networking and community building, grant-funding of projects at a wide range of scales, directing support services to businesses, work through centres such as the Open Data Institute and Connected Digital Economy Catapult, targeted procurement through projects such as LinkedGov, and inputs to policy. The role requires drawing upon this toolkit to design a coordinated programme of interventions that has impact in its own right and which also coordinates with other activities across TSB and the wider innovation landscape.
- Via the ECJ, a relayed message from the NICAR-L mailing list about a couple of jobs going with The Times and Sunday Times:
A couple of jobs that might be of interest to NICAR members here at the Times of London…
The first is an investigative data journalist role, joining the new data journalism unit which will work across both The Times and The Sunday Times.
The other is an editorial developer role: this will sit within the News Development Team and will focus on anything from working out how we tell stories in richer more immersive ways, to creating new ways of presenting Times and Sunday Times journalism to new audiences.
Please get in touch if you are interested!
Head of news development, The Times and Sunday Times
Not a job ad as such, but an interesting recent innovation from the Birmingham Mail:
We’ve launched a new initiative looking at the numbers behind our city and the stories in it.
‘Behind The Numbers’ is all about the explosion in ‘data’: information about our hospitals and schools, crime and the way it is policed, business and sport, arts and culture.
We’d like you to tell us what data you’d like us to publish and dig into. Email suggestions to email@example.com. Follow @bhamdatablog on Twitter for updates or to share ideas.
This was also new to me: FT Data, a stats/datablog from the FT? FullFact is another recent addition to my feed list, with a couple of interesting stories each day and plenty of process questions and methodological tricks that can be, erm, appropriated ;-) Via @JackieCarter, the Social Statistics blog looked interesting, but the partial RSS feed is a real turn off for me so I’ll probably drop it from my reader pretty quickly unless it turns up some *really* interesting posts.
Here are some examples of previously advertised jobs…
- A job that was being advertised at the end of last year (now closed) by the Office for National Statistics (ONS) (current vacancies) was for the impressive sounding Head of Rich Content Development:
The postholder is responsible for inspiring and leading development of innovative rich content outputs for the ONS website and other channels, which anticipate and meet user needs and expectations, including those of the Citizen User. The role holder has an important part to play in helping ONS to realise its vision “for official statistics to achieve greater impact on key decisions affecting the UK and to encourage broader use across the country”.
1. Inspires, builds, leads and develops a multi-disciplinary team of designers, developers, data analysts and communications experts to produce innovative new outputs for the ONS website and other channels.
2. Keeps abreast of emerging trends and identifies new opportunities for the use of rich web content with ONS outputs.
3. Identifies new opportunities, proposes new directions and developments and gains buy in and commitment to these from Senior Executives and colleagues in other ONS business areas.
4. Works closely with business areas to identify, assess and commission new rich-content projects.
5. Provides vision, guidance and editorial approval for new projects based on a continual understanding of user needs and expectations.
6. Develops and manages an ongoing portfolio of innovative content, maximising impact and value for money.
7. Builds effective partnerships with media to increase outreach and engagement with ONS content.
8. Establishes best practice in creation of rich content for the web and other channels, and works to improve practice and capability throughout ONS.
- From December 2010, a short term contract at the BBC for a data journalist:
The team is looking for a creative, tech-savvy data journalist (computer-assisted reporter) to join its website specials team to work with our online journalists, graphic designer and development teams.
Role Purpose and Aims
You will be required to humanize statistics; to make sense of potentially complicated data and present it in a user friendly format.
You will be asked to focus on a range of data-rich subjects relating to long-term projects or high impact daily news stories, in line with Global News editorial priorities. These could include the following: reports on development, global poverty, Afghanistan casualties, internet connectivity around the world, or global recession figures.
Key Knowledge and Experience
You will be a self-starter, brimming with story ideas who is comfortable with statistics and has the expertise to delve beneath the headline figures and explain the fuller picture.
You will have significant journalistic experience gained ideally from working in an international news environment.
The successful candidate should have experience (or at least awareness) of visualising data and visualisation tools.
You should be excited about developing the way that data is interpreted and presented on the web, from heavy number crunching, to dynamic mapping and interactive graphics. You must have demonstrated knowledge of statistics, statistical analysis, with a good understanding of the range and breadth of data sources in the UK and internationally, broad experience with data sources, data mining and have good visual and statistical skills.
You must have a Computer-assisted reporting background or similar, including a good knowledge of the relevant software (including Excel and mapping software).
Experience of producing and developing data driven web content at a senior level within time and budget constraints.
Central to the role is an ability to analyse complicated information and present it to our readers in a way that is visually engaging and easy to understand, using a range of web-based technologies, for which you should have familiarity with database interfaces and web presentation layers, as well as database concepting, content entry and management.
You will be expected to have your own original ideas on how to best apply data driven journalism, either to complement stories when appropriate or to identify potential original stories while interpreting data, researching and investigating them, crunching the data yourself and working with designers and developers on creating content that will engage our audience, and provide them with useful, personalised information.
FWIW, it’s probably worth remembering that the use of data is not necessarily a new thing… For example, this post – The myth of the missing Data Scientist – does a good job debunking some of the myths around “data science”.
Following the official opening of the Open Data Institute (ODI) last week, a flurry of data related announcements this week:
- A big one for stats fans with the release of 2011 Census data by the ONS: 2011 Census, Key Statistics for Local Authorities in England and Wales. A few charts appear to have made it into the mix (along with the data to generate them), which I guess sets the baseline for whoever lands the currently advertised Head of Rich Content at the ONS job…
The data files associated with press releases are published as Excel spreadsheets. I guess this reflects, in part, the need to come up with a container that can cope with all the metadata. It’s a bit of a pain, though. One thing I keep meaning to explore further is ways of bundling data in R packages, along with scripts for analysing and visualising the data so bundled (eg US Census Spatial and Demographic Data in R: The UScensus2000 Suite of Packages or US consumer expenditure survey (ce) in R). I probably should also look again at Google’s Dataset Publication Language (DSPL) as well as other packaging formats. I need to check out the latest major release from the W3C Provenance Working Group too…
- Over at BIS, £8 million of investment in open public data is announced, the major chunk of which goes to the Data Strategy Board (#datastrategy) Breakthrough Fund to help public bodies get over short term technical barriers to releasing open public data. I keep wittering on about mapping out data flows that already exist and then finding ways to tap into them directly, so won’t repeat that here;-) A smaller pot, administered by the ODI, will be available to SMEs via the Open Data Immersion Programme. Also announced, the Ordnance Survey will be widening the availability of its range of mapping data.
- Not sure if I missed this when it was presumably announced? The Data Strategy Board’s chair Stephan Shakespeare (CEO of YouGov Plc) is leading an independent review of public sector information (here are the (draft) terms of reference). I’m not sure how this review fits into the tangle of reporting lines associated with the Data Strategy Board and the Public Data Group (the latter seems to have been very quiet?). I also wonder where the ODI fits into that whole structure?
- The funding around public open data coincided with a written Ministerial statement from the Cabinet Office that provided an Update on Departmental Open Data Commitments and adherence to Public Data Principles (original link on a gov.uk domain, h/t @owenboswarva). The update is spectacularly lacking in linking to any of the raw data that is summarised in the actual statement, so so much for any actual transparency there… The same minister, Francis Maude, has also been fulfilling his social media obligations with a piece in the Huffington Post on A Practical Vision for Open Government. (In other news, at the micro/pragmatic level of open public data, I’m still finding that week on week releases of NHS sitrep data show minor differences in formatting and occasional errors…)
Things have been moving on the Communications Data front too. Communications Data got a look in as part of the 2011/2012 Intelligence and Security Committee Annual Report with a review of what’s currently possible and “why change may be necessary”. Apparently:
118. The changes in the telecommunications industry, and the methods being used by people to communicate, have resulted in the erosion of the ability of the police and Agencies to access the information they require to conduct their investigations. Historically, prior to the introduction of mobile telephones, the police and Agencies could access (via CSPs, when appropriately authorised) the communications data they required, which was carried exclusively across the fixed-line telephone network. With the move to mobile and now internet-based telephony, this access has declined: the Home Office has estimated that, at present, the police and Agencies can access only 75% of the communications data that they would wish, and it is predicted that this will significantly decline over the next few years if no action is taken. Clearly, this is of concern to the police and intelligence and security Agencies as it could significantly impact their ability to investigate the most serious of criminal offences.
N. The transition to internet-based communication, and the emergence of social networking and instant messaging, have transformed the way people communicate. The current legislative framework – which already allows the police and intelligence and security Agencies to access this material under tightly defined circumstances – does not cover these new forms of communication. [original emphasis]
Elsewhere in Parliament, the Joint Select Committee Report on the Draft Communications Data Bill was published and took a critical tone (Home Secretary should not be given carte blanche to order retention of any type of data under draft communications data bill, says joint committee. “There needs to be some substantial re-writing of the Bill before it is brought before Parliament” adds Lord Blencathra, Chair of the Joint Committee.) Friend and colleague Ray Corrigan links to some of the press reviews of the report here: Joint Committee declare CDB unworkable.
In other news, Prime Minister David Cameron’s announcement of DNA tests to revolutionise fight against cancer and help 100,000 patients was reported via a technology angle – Everybody’s DNA could be on genetic map in ‘very near future’ [Daily Telegraph] – as well as by means of more reactionary headlines: Plans for NHS database of patients’ DNA angers privacy campaigners [Guardian], Privacy fears over DNA database for up to 100,000 patients [Daily Telegraph].
If DNA is your thing, don’t forget that the Home Office already operates a National DNA Database for law enforcement purposes.
And if national databases are your thing, there’s always the National Pupil Database, which was in the news recently with the launch of a consultation on proposed amendments to individual pupil information prescribed persons regulations which seeks to “maximise the value of this rich dataset” by widening access to this data. (Again, Ray provides some context and commentary: Mr Gove touting access to National Pupil Database.)
PS A late inclusion: DECC announcement around smart meter rollout with some potential links to #midata strategy (eg “suppliers will not be able to use energy consumption data for marketing purposes unless they have explicit consent”). A whole raft of consultations were held around smart metering and Government responses are also published today, including Government Response on Data Access and Privacy Framework, the Smart Metering Privacy Impact Assessment and a report on public attitudes research around smart metering. I also spotted an earlier consultation that had passed me by around the Data and Communications Company (DCC) License Conditions; here’s the response, which opens with: “The communications and data transfer and management required to support smart metering is to be organised by a new central communications body – the Data and Communications Company (“the DCC”). The DCC will be a new licensed entity regulated by the Gas and Electricity Markets Authority (otherwise referred to as “the Authority”, or “Ofgem”). A single organisation will be granted a licence under each of the Electricity and Gas Acts (there will be two licences in a single document, referred to as the “DCC Licence”) to provide these services within the domestic sector throughout Great Britain”. Another one to put on the reading pile…
Putting a big brother watch hat on, the notion of “meter surveillance” brings to mind a BBC article about an upcoming (will hopefully thence be persistently available on iPlayer?) radio programme on “Electric Network Frequency (ENF) analysis”, The hum that helps to fight crime. According to Wikipedia, ENF is a forensic science technique for validating audio recordings by comparing frequency changes in background mains hum in the recording with long-term high-precision historical records of mains frequency changes from a database. In turn, this reminds me of appliance signature detection (identifying what appliance is switched on or off from its electrical load curve signature), for example Leveraging smart meter data to recognize home appliances. In the context of audio surveillance, how about supplementing surveillance video cameras with microphones? Public Buses Across Country [US] Quietly Adding Microphones to Record Passenger Conversations.
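To get a feel for how the ENF matching idea might work, here’s a minimal Python sketch. It assumes the mains hum has already been pulled out of the recording as a short list of frequency readings (one per second, say), and that we have a longer historical log of mains frequency to compare it against. The function name `best_enf_match` and all the numbers are made up for illustration, and a real forensic tool will be doing something rather more sophisticated than this brute-force sliding comparison.

```python
def best_enf_match(recording, reference):
    """Slide the recording's frequency trace along the reference log and
    return (offset, mean squared error) for the closest alignment.

    Both arguments are lists of mains frequency readings in Hz, assumed
    to be sampled at the same rate (an assumption of this sketch).
    """
    if not recording or len(recording) > len(reference):
        raise ValueError("recording must be non-empty and no longer than reference")

    n = len(recording)
    best_offset, best_error = 0, float("inf")

    # Try every position at which the recording fits inside the reference log
    for offset in range(len(reference) - n + 1):
        error = sum(
            (value - reference[offset + i]) ** 2
            for i, value in enumerate(recording)
        ) / n
        if error < best_error:
            best_offset, best_error = offset, error

    return best_offset, best_error


# Made-up example values: a short trace hidden inside a longer log
reference_log = [50.00, 50.01, 49.98, 49.97, 50.02, 50.03, 49.99, 50.00]
recorded_hum = [49.97, 50.02, 50.03]

offset, error = best_enf_match(recorded_hum, reference_log)
print(offset, error)
```

The offset of the best match suggests when the recording was made relative to the start of the log, and a poor best score (or a trace that only matches in disconnected pieces) might hint that the audio has been edited.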