The FOI Route to Real (Fake) Open Data via WhatDoTheyKnow

In FOI Signals on Useful Open Data?, I pondered whether we could use information about Freedom of Information (FOI) requests to help identify what sorts of data folk are actually interested in, on the grounds that they have gone to the trouble of making an FOI request for that data.

I couldn’t help but start to try working various elements of that idea through, so here’s a simple baby step to begin with: a scraper on Scraperwiki (Scraperwiki scraper: WhatDoTheyKnow requests) that searches for FOI requests made through WhatDoTheyKnow that got one or more Excel/xls spreadsheets back as an attachment.
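By way of illustration, here's a minimal sketch of the attachment-spotting step: given the HTML of a response page, pull out any links that look like spreadsheets. This is not the Scraperwiki scraper itself. The function and class names, the example.org URL and the sample markup are all made up for the example, and I'm assuming that attachments show up as ordinary links whose path ends in .xls or .xlsx.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: a spreadsheet attachment is any link whose path ends in one of these.
SPREADSHEET_EXTENSIONS = (".xls", ".xlsx")


class AttachmentLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a spreadsheet file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Ignore any query string when checking the file extension.
        path = href.lower().split("?")[0]
        if path.endswith(SPREADSHEET_EXTENSIONS):
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def spreadsheet_links(html, base_url):
    """Return absolute URLs of spreadsheet attachments linked from the page."""
    parser = AttachmentLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup, not copied from a real response page.
sample_html = (
    '<p><a href="/request/1/response/2/attach/3/data.xls">data.xls</a> '
    '<a href="/help">Help</a></p>'
)
print(spreadsheet_links(sample_html, "https://example.org"))
```

Working on an HTML string rather than fetching the page inside the function keeps the parsing step easy to test without hitting the live site.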

I’ve also popped up a Scraperwiki view that lets you browse the data-returning requests made to local councils or universities.

Clicking through on an FOI request link takes you to the response that contains the data file, which can be downloaded directly or previewed on Zoho.

It strikes me that if I crawled the response pages, I could build my own index of data files, catalogued according to FOI request titles, in effect generating a “fake” open data catalogue, as powered by FOI requests…? (What would be really handy in the local council requests would be if the responses were tagged with an appropriate LGSL code or IPSV terms (indexing on the way out) as a form of useful public metadata that can help put the FOI released data to work…?)
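A rough sketch of what that index might look like, assuming the crawl has already produced (request title, file URL) pairs: group the files under their request title so the titles act as catalogue entries. The function name, the sample titles and the example.org URLs are all invented for the purposes of the example.

```python
from collections import defaultdict


def build_catalogue(rows):
    """Group data file URLs under the title of the FOI request that released them.

    rows is an iterable of (request_title, file_url) pairs, which is an
    assumed shape for whatever a crawl of the response pages would produce.
    """
    catalogue = defaultdict(list)
    for title, url in rows:
        key = title.strip()  # tidy stray whitespace so near-identical titles merge
        if url not in catalogue[key]:  # skip files we have already listed
            catalogue[key].append(url)
    return dict(catalogue)


# Made-up rows for illustration only.
sample_rows = [
    ("Parking fines issued", "https://example.org/a/fines.xls"),
    ("Parking fines issued ", "https://example.org/b/fines_2.xls"),
    ("Library opening hours", "https://example.org/c/hours.xls"),
    ("Library opening hours", "https://example.org/c/hours.xls"),
]

catalogue = build_catalogue(sample_rows)
for title, files in catalogue.items():
    print(title, len(files))
```

If responses did carry LGSL or IPSV tags, the same grouping could be keyed on those instead of on the free text titles.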

As to whether the requests are useful for signaling particular topic areas as good candidates for “standard” open data releases, I still need to do some text analysis on the request titles. In the meantime, you can enter a keyword/key phrase in the Request text box to filter the table so that it only shows requests whose title contains that keyword/phrase. (The Council drop down list allows you to filter the table so that it only shows requests for a particular university/council.)
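To give a flavour of both the filter and a very crude first pass at the text analysis, here's a small sketch that does a case-insensitive phrase match on request titles and then counts the terms that turn up across titles. The record layout, the stopword list and the sample requests are all assumptions made up for the example, not the data behind the Scraperwiki view.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real analysis would want a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def filter_by_title(requests, phrase):
    """Keep only the requests whose title contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return [r for r in requests if needle in r["title"].lower()]


def title_term_counts(requests):
    """Count how often each non-stopword term appears across request titles."""
    counts = Counter()
    for r in requests:
        words = re.findall(r"[a-z]+", r["title"].lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Made-up requests for illustration only.
sample_requests = [
    {"title": "Parking fines issued in 2010", "body": "Example Council"},
    {"title": "Spending on parking enforcement", "body": "Example Council"},
    {"title": "Library opening hours", "body": "Another Council"},
]

print([r["title"] for r in filter_by_title(sample_requests, "parking")])
print(title_term_counts(sample_requests).most_common(3))
```

Terms that recur across many request titles would be one (very rough) signal of the topic areas people keep asking about.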

PS via a post on HelpMeInvestigate, I came across this list of FOI responses to requests made to the NHS Prescription Pricing Division. From a quick skim, some of the responses have “data” file attachments, though in the form of PDFs rather than spreadsheets/CSV. However, it would be possible to scrape the pages to at least identify ones that do have attachments (which is a clue they may contain data sets?)

So now I’m wondering – what other bodies produce full lists of FOI requests they have received, along with the responses to them?

PPS See also this search query on FOI Release publications.

Author: Tony Hirst

I'm a lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...