Time for data.ac.uk? Or a local data.open.ac.uk?

Over a swift half at the end of the rather wonderful Liver’n’Mash, Mike Nolan chatted through some of his thoughts around his presentation on data.ac.uk that I hadn’t been able to get to see.

From looking back over Mike’s presentation slides, he seems to be advocating “let’s do what we can, as soon as possible”, particularly with respect to sharing things like information about courses and events, as well as other data from university information systems. That emphasis seems to me to be on syndicating information already available on university websites in a more data-like way (RSS news feeds, for example, or calendar feeds); this is similar to the approach taken by the new DirectGov API, I think?

(By the by, I came across this related presentation earlier today, something I’d prepared for the SocialLearn project, as was, a couple of years ago: Portable Course Data.)

As part of the same conversation, Brian Kelly suggested that just as the open data lobby had been calling for government to open up its data, government might well respond by calling on public sector organisations to open up their data. This has already started to happen, for example with a letter from Downing Street last week calling on local councils to get ready to open up some of their financial and organisational chart data.

I think Brian is right in suggesting that Higher Education should brace itself to expect similar treatment… (A lot of this data is already out there, it has to be said. For example, here’s a spreadsheet detailing VCs’ pay.)

So what is my take on how to get started with data.ac.uk, or a more local version, such as data.open.ac.uk?

To my mind, the quickest start is to just republish data that is already available in data form. So for example:

– student satisfaction data is available from the Direct Gov Unistats service (OU data [XLS]; general download list);
– funding data about current grants is provided on research council sites. The EPSRC, for example, provide a way of accessing spreadsheets for funding received by various OU departments: OU Awards from the EPSRC (see, more generally, the full list of funded organisations). (If you know of similar ways of getting similar data from other research councils, or funders such as JISC, please post a link in the comments to this post :-);
– financial data, where already published; the OU’s public financial statements can be found on the Freedom of Information minisite, for example (OU FOI: financial statements);
– organisational data, where already published. Again the OU seems to be ahead of the game on this one via the FOI site: OU FOI: organisational structure; (the FOI site also includes pay grade details, so you’ll be able to see just how overpaid I really am, despite all my wittering;-)
– RAE (Research Assessment Exercise) data: one possible source of this information is the Guardian DataStore (Guardian datastore: RAE data, original data from rae.ac.uk [XLS]).

(From that quick list, the OU seems to be doing really well via the OU FOI website. Are other HEIs as far on as this, I wonder, or does having Open in the university name create raised expectations around the OU on matters such as this?!)

The Guardian has also republished quite a range of additional HE related data in its datastore, some of which I’ve even played with before… e.g. Does Funding Equal Happiness in Higher Education? (though there have been one or two, err, niggles with the data… in previous spreadsheets;-) or for a fuller list: OUseful visualisations around education data.

Another possible source of data in a raw form is the data.gov.uk education datastore (an example can be found via here), which makes me wonder about the extent to which a data.ac.uk website might just be an HE/FE view over that wider datastore? (Related: @kitwallace on University data.) And then maybe, hence: would data.*.ac.uk be a view over data.ac.uk for a particular institution? Or *.sch.ac.uk a view over a data.sch.ac.uk view over the full education datastore?

As to how best to publish the data? That’ll probably take another post, though a really quick win could be achieved by just grabbing the appropriate data from a Guardian datastore spreadsheet on Google docs, putting it into another Google doc, and then just embedding it in a page…;-)
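For anyone who would rather not rely on a spreadsheet embed, here is a rough sketch of the same quick win done by hand: take a CSV export of the sheet and turn it into a plain HTML table that can be pasted into a page. The function name `csv_to_html_table` and the sample rows are my own placeholders (the values are dummies, not real figures), and I'm assuming the first row of the export is a header row.

```python
import csv
import html
import io


def csv_to_html_table(csv_text):
    """Render CSV text as a minimal HTML table (first row treated as the header)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    lines = ["<table>"]
    for i, row in enumerate(rows):
        tag = "th" if i == 0 else "td"
        # Escape each cell so stray < or & in the data cannot break the page
        cells = "".join(f"<{tag}>{html.escape(cell)}</{tag}>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Placeholder rows standing in for a downloaded spreadsheet export
SAMPLE_CSV = "institution,measure,value\nExample University,placeholder measure,0\n"

if __name__ == "__main__":
    print(csv_to_html_table(SAMPLE_CSV))
```

Nothing clever is going on here; the point is only that once the data is in a spreadsheet, republishing it on an institutional page is a few lines of glue.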

PS In his post, Mike mentioned an old hack of mine that searched for autodiscoverable RSS feeds on *.ac.uk websites. I’d also done one that puts up screenshots of 404 pages… Maybe I need one that looks for the existence of data.*.ac.uk subdomains?!
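A hack like that might start out something like the following sketch. It is not the code behind either of the earlier hacks, just an illustration: the helper names (`data_hostname`, `resolves`, `find_data_subdomains`) are made up for this example, and a successful DNS lookup only shows that the name exists, not that there is any data behind it.

```python
import socket


def data_hostname(domain):
    """Return the data.* hostname to try for an institution's domain."""
    domain = domain.strip().lower().rstrip(".")
    if domain.startswith("www."):
        domain = domain[len("www."):]
    return "data." + domain


def resolves(hostname):
    """True if the hostname resolves in DNS; says nothing about what is served there."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def find_data_subdomains(domains, resolver=resolves):
    """Return the data.* hostnames that the resolver reports as existing."""
    return [host for host in map(data_hostname, domains) if resolver(host)]


if __name__ == "__main__":
    # The list of *.ac.uk domains to check is an input you would need to supply
    print(find_data_subdomains(["www.open.ac.uk"]))
```

The resolver is passed in as a parameter so the lookup can be swapped out, for example for an HTTP check, or for a stub when trying the code without a network connection.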

PPS Finally, it’s probably worth just paying heed to notions of Good and bad Transparency. The line I’m suggesting above is one of convenient discovery as much as anything else, pulling (links to) all the data sets related to an institution into an area of the institution’s own website. Cf. the similar approach taken by data.gov.uk, which is to act primarily as a directory layer, as well as hosting national level datastores for particular datasets.
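To make the "directory layer" idea a little more concrete, here is a minimal sketch of what such a listing might look like as data: a list of links to datasets held elsewhere, serialised as JSON. The field names (`title`, `publisher`, `url`) and the example.org URLs are assumptions made purely for illustration; they are not an existing schema or real locations.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetLink:
    title: str      # human-readable name of the dataset
    publisher: str  # who actually hosts the data
    url: str        # where it lives (placeholder URLs below)


def build_directory(institution, links):
    """Bundle links to externally hosted datasets into one directory record."""
    ordered = sorted(links, key=lambda link: link.title.lower())
    return {
        "institution": institution,
        "datasets": [asdict(link) for link in ordered],
    }


if __name__ == "__main__":
    links = [
        DatasetLink("Student satisfaction data", "Unistats",
                    "https://example.org/placeholder/satisfaction"),
        DatasetLink("EPSRC awards", "EPSRC",
                    "https://example.org/placeholder/epsrc-awards"),
    ]
    print(json.dumps(build_directory("The Open University", links), indent=2))
```

The directory holds no data of its own, only pointers, which is the sense in which a data.open.ac.uk page could be about convenient discovery rather than hosting.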


