FOI Signals on Useful Open Data?
In an Open Data Cities reviewing post (Open data ‘must be driven by need’), Tom Steinberg is quoted as follows:
“A lot of the attitude around open data is what can we give away, what can we give out?” said Steinberg, founder and director of charity mySociety. “Then they say ‘no-one seems to be using it, let’s have a hackday, see if we can create incentives.’ Meanwhile in the freedom of information department there is a pile of requests building up that won’t go away based on real desires – someone really wants to know something.”
The trick for councils will be to train staff right across the authority to spot information requests that could be handled by releasing new types of data, and then to empower someone to help make sure this happens, he said.
Hooking up FOI and open data processes (along with data burden related reporting requirements) seems to me a sensible way of identifying, as part of normal workflow, what data might usefully be opened up.
So how might we go about using FOI signals to identify the sorts of datasets that councils might usefully release?
Defining ‘useful’ is the first step, so for now let’s assume that if someone has gone to the trouble of putting in an FOI request, it’s useful. (If nothing else, publishing the data as part of an opendata process rather than via an FOI request route makes for less work for the FOI department if a similar request is made at a later date, so it may be cost-saving from that point of view too?)
The next step is: where can we find some data relating to FOI requests?
WhatDoTheyKnow, the mySociety site that makes it easy to submit and track FOI requests, seems like a good place to start. To begin with, it’s easy enough to find a list of councils to whom FOI requests have been made via WhatDoTheyKnow (this information is also available more directly as data by downloading the full list of bodies listed on WhatDoTheyKnow and then extracting items tagged with local_council [a copy of this listing is also available as a db on scraperwiki]).
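As a rough sketch of that extraction step, something like the following might do, assuming the list of bodies has been downloaded as CSV text. The column names ("URL name", "Tags") and the idea that tags are whitespace-separated in a single column are my assumptions about the file layout, so check them against the actual download; the sample rows are made up purely to show the shape of the input.

```python
import csv
import io


def local_council_slugs(csv_text, tag="local_council",
                        slug_col="URL name", tags_col="Tags"):
    """Return the URL slugs of bodies whose tag list contains `tag`.

    Column names are assumptions about the downloaded file, not a
    documented format; pass different ones if the file differs.
    """
    slugs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: tags are whitespace-separated within one column.
        tags = (row.get(tags_col) or "").split()
        if tag in tags:
            slugs.append(row[slug_col])
    return slugs


# Made-up sample rows, only to illustrate the expected input shape.
sample = (
    "Name,URL name,Tags\n"
    "Kent County Council,kent_county_council,local_council county_council\n"
    "Example Health Trust,example_health_trust,nhs\n"
)
print(local_council_slugs(sample))
```

Matching on whole tags (rather than a substring test) avoids accidentally picking up bodies whose tags merely contain the text local_council as part of a longer tag.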
We can now use the unique URL slug/identifier associated with each council to find FOI requests made to that council. Here are a couple of advanced search patterns that may be useful:
- filetype:xls requested_from:kent_county_council – search a particular council for requests that had an Excel spreadsheet file (presumably: some data…) in response
- status:successful requested_from:kent_county_council – search for successfully handled requests
(It might also be worth running requests using the keyword data?)
Results are returned in page lengths of 25 items, so to see all results you need to page through them (using a qualifier of the form &page=N for results page N).
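The search patterns and paging can be put together along these lines. This is only a sketch: the base URL and the `query` parameter name are assumptions on my part (the real site may put the search terms in the URL path instead), and only the search operators and the page qualifier come from the patterns described above.

```python
from urllib.parse import urlencode

# Assumed base URL and parameter names; check against the live site.
SEARCH_BASE = "https://www.whatdotheyknow.com/search"


def build_query(council_slug, status=None, filetype=None):
    """Assemble an advanced search string for one council."""
    parts = []
    if filetype:
        parts.append(f"filetype:{filetype}")
    if status:
        parts.append(f"status:{status}")
    parts.append(f"requested_from:{council_slug}")
    return " ".join(parts)


def page_urls(query, n_pages, base=SEARCH_BASE):
    """Return one URL per results page, numbered from 1."""
    return [
        f"{base}?{urlencode({'query': query, 'page': page})}"
        for page in range(1, n_pages + 1)
    ]


query = build_query("kent_county_council", status="successful")
for url in page_urls(query, 3):
    print(url)
```

With 25 results per page, the number of pages to request is the total result count divided by 25, rounded up.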
Results include a title/link text that identifies the topic of the FOI request, and a link to the request page itself, which logs all correspondence (and returned files) associated with the request.
(Methinks it would be really handy if the search results were made available as JSON feeds…)
If we scrape the link text of successful requests to all of (or a reasonable sample of) the councils, or grab the subject of requests that returned data files, we should be able to do some simple text analysis that might identify recurring topics that are the subject of requests across several councils. This might help signal the sorts of data that are commonly requested across different councils, as well as datasets or information that might be candidates for opening up as “useful” open data or open information. (Of course, it might not turn up anything of interest at all… But the experiment is quite a quick one to run in a basic form…)
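A very basic form of that experiment might look like the sketch below: given request titles grouped by council, count how many different councils each term turns up in, and keep the terms seen in several. The stopword list, the tokenisation and the example titles are all invented for illustration; real titles would need rather more cleaning.

```python
import re
from collections import Counter

# A tiny, ad hoc stopword list; extend as needed.
STOPWORDS = {"the", "of", "and", "for", "in", "to", "a", "on", "by", "at", "with"}


def terms(title):
    """Lowercased alphabetic words of 3+ letters, minus stopwords."""
    words = re.findall(r"[a-z]+", title.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}


def recurring_terms(titles_by_council, min_councils=2):
    """Return (term, council_count) pairs for terms seen in the request
    titles of at least `min_councils` different councils."""
    council_counts = Counter()
    for titles in titles_by_council.values():
        seen = set()
        for title in titles:
            seen |= terms(title)
        # Count each term once per council, however often it appears there.
        council_counts.update(seen)
    return [(t, n) for t, n in council_counts.most_common() if n >= min_councils]


# Invented titles, purely to show the input shape.
titles = {
    "council_a": ["Parking fines 2011", "School meals spend"],
    "council_b": ["Parking penalty notices", "Pothole repairs"],
    "council_c": ["Pothole claims", "Parking income"],
}
print(recurring_terms(titles))
```

Counting councils rather than raw occurrences stops one council with lots of requests on a single topic from dominating the results, which seems closer to the "requested across several councils" signal we're after.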
A more laboured approach might be to do text analysis over the text of actual requests (or clarifications), but this would involve scraping the WhatDoTheyKnow site rather more intensively (i.e. grabbing each request page rather than just scraping the search results).
Worth doing, or not? Or maybe someone’s tried this approach already? If so, anyone got a link…?
PS Related: a recent report by the National Audit Office on Government progress on its transparency of public information agenda – NAO: Implementing transparency report (NAO: Press Release – Implementing transparency). The report includes a handy timeline of notable events over the last few years in the history of UK open public data (my own, less complete version is here: Open Standards and Open Data).