Next Steps Taken for data.ac.uk…

One of the problems with doing “data stuff” in a particular sector is finding data from across the sector. data.ac.uk seeks to help simplify the discovery of (and maybe even normalised access to) data published across the UK Higher Education sector.

data.ac.uk homepage

A new unveiling this week was the HE equipment register (which I think grew out of the Uniquip equipment and facility sharing project?), which is intended to provide a single point of access for looking up access to research facilities and equipment.

data.ac.uk equipment register

(I think the research councils increasingly require universities to have a plan for making funded research equipment available to businesses, and providing a catalogue to look up such equipment facilitates that.) I’m not sure about the coverage of this catalogue at the moment, or how it relates (not least in a data sharing way) to research equipment sharing consortia such as the N8 Research Partnership (Durham, Lancaster, Leeds, Liverpool, Manchester, Newcastle, Sheffield and York) or the M5 Group (Birmingham, Leicester, Loughborough, Nottingham, Warwick, and Aston). (There’s also fragmentary evidence of an S5 grouping (Cambridge, Imperial, Oxford, UCL and Southampton) but I haven’t found a public website for them?)

Another area of data.ac.uk provides a handy link to “administrative” information relating to HEIs – Learning Providers data, although I’m not sure to what extent this overlaps with the data contained in the JISC Monitoring Unit (JISC MU) database?

learning providers

One problem with open data sites having national or sector coverage is that, whilst we might hope that individual locations will submit data to the datastore, it’s often more likely the case that a dataset will need curating and collecting together by a dedicated and interested (obsessive?) individual. To date, Chris Gutteridge has been doing a lot of the work on data.ac.uk, but he doesn’t necessarily scale!;-)

Architecturally, the site is designed to support what I guess we could describe as federated management. Subdomains are used to identify different topic or category areas, with a top bar menu providing navigation to other areas of data.ac.uk. In principle, anyone could propose, host, and curate data from across the sector relating to a particular topic. Unlike sites such as OpenlyLocal, the model does not (yet?) support pages built around the opendata offerings of a particular institution, though I guess someone could generate something like quickdata.ac.uk/university-name to provide a summary page for each university on a “quickdata” subdomain?

There is possibly an issue regarding the “status” of data.ac.uk in respect of the extent to which it provides a single point of access to normalised data within a topic area, compared to linking out to locally hosted versions of data relating to particular items (for example, we might imagine foi.data.ac.uk linking to FOI homepages on each university website, or orgcharts.data.ac.uk linking to data source pages on university websites, ordered by university). On the other hand, there are pre-existing “national datasets” such as the data collated by JISC MU, or the research council funding data that looks as if it’ll be made available via the Research Councils UK Gateway to research. For these national collections, the data.ac.uk model would “allow” for sites like Gateway to Research to take over something like the gtr.data.ac.uk subdomain, and add the data.ac.uk top bar to their site, though I could see all sorts of issues with that relating to perceived ownership! One possible way around this would be to provide a button that “partner” sites could include that would identify a site as being part of the data.ac.uk federation and then pop up the top bar if folk wanted to explore other data.ac.uk federation sites? URLs such as gtr.data.ac.uk would then simply act as redirects into sites with independent branding/look and feel, but with a data.ac.uk federation member button on them somewhere?

In other news, the Linked Up Challenge also launched this week “promoting the innovative use of linked and open data in an educational context”.

linkededucation data endpoint

The competition will be making available data drawn from across the European HE sector and published as Linked Data:

linked up challenge

For some reason, this springs to mind…

Hmmm…

Tracking Down Local Government Consultation Web Pages

One of the things I have on my to do list for this year is to try to get a joint paper out with Danilo Rothberg on public consultation platforms at local, national and European level.

In the UK, many local councils have an area of their website dedicated to local consultations, so my first hacky thought for a way to track them down was to scrape something together around a Google search of the form: site:gov.uk intitle:consultation intitle:council.

By chance, I stumbled across a page on OpenlyLocal linking to the services offered by a particular council, which made me wonder if I could actually pull down a list of the URLs of consultation pages by council directly from OpenlyLocal.

A quick Twitter exchange with that site’s maestro, Chris Taggart/@countculture, suggested that OpenlyLocal “[s]piders the Localgov redirect urls every week… …trick is knowing the LGD service id code, and then you can get all URLs for councils with URL for it”. In addition, “It’s the OL key you need (that maps to the ldg native uid). Something like this: http://openlylocal.com/services?ldg_service_id=370”.

So, decoding that, and with a bit of extra Googling, here’s where I’m at:

  • from the esd/effective service delivery toolkit (“Facilitated by the Local Government Association (LGA) working for local government improvement so councils can serve people and places better. esd-toolkit is owned and led by the local government sector”), we can find the LGD service ID codes for services relating to consultations:
    • Council – consultation – service delivery (867): All councils are expected to consult on specific areas of their service delivery. This allows service users and other interested parties to have to opportunities to be involved in planning, prioritising and monitoring of services. It also gives customers an opportunity to see all consultation activity, both current and in the past, and a mechanism for customers to research satisfaction with service delivery, opinions about specific projects and looks at lifestyle profiles which helps us design better local services.
    • Council – consultation and community engagement (366): The local authority uses various means to consult and engage with local communities including development of community and citizens’ forums and panels, consultation events, public events, young people’s participation.
    • Council – spending plans – consultation (658): Arrangement of public meetings or other means by which citizens can be consulted on budget plans for the forthcoming year. Previous consultations may be published or available for view on request.
    • Education – consultations (49): The education authority consult with all interested parties (schools, teachers, parents, pupils) on all issues concerning education provision and in particular on any proposed changes to education within schools run by the authority.
    • Equalities and diversity – assessment and consultation (861): The LA is responsible for ensuring that equality and diversity is considered at all times both in employment policy and in the provision of services. Every authority should assess, and consult on, the impact of policy in relation to equality and diversity within their community
    • Planning – consultation (855): The involvement of the public in the planning process. When planning applications are submitted there is a comprehensive system in place which ensures that proposals are publicised in order to invite comments from the local community.
  • To pull down the URL associated with each service for each council from OpenlyLocal (URLs of the form http://openlylocal.com/services?ldg_service_id=370), we need to know the mapping from Local Service ID codes shown above to the corresponding OpenlyLocal service codes (link???)
  • The DirectGov A-Z Directory of Local Services page links to alphabetical listings of service related pages presumably keyed on the Local Gov Service ID (LGSL= in the URL?), though on a quick skim through the listings I couldn’t find any consultation related services? [Ah, I should probably have tried from here: Directgov: Find out about local consultations]
  • From the Local Directgov on the Dept for Communities and Local Government website, I found a newsletter link to Local Directgov: open datasets
  • On data.gov.uk, there’s a handy CSV data file referred to as the Local directgov services list: “This dataset is held on the Local Directgov platform which provides the deep links into Local council websites for a number of services in Directgov. The Local Authority Service details holds the local council URLS for over 240 services where the customer can directly transfer to the appropriate service page on any council in England.” The CSV data is organised as follows:
    Authority Name,SNAC,LAid,Service Name,LGSL,LGIL,Service URL
    ...
    Adur District Council,45UB,1,Find out about local consultations,867,8,http://www.adur.gov.uk/consultation/index.htm
    ...

So, that’s where I’m at… I now have a CSV file from data.gov.uk with a list of deep link URLs in to local gov websites, and a set of Local Gov Service IDs from esd that allow me to identify the links corresponding to various sorts of consultation.
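As a rough sketch of that filtering step, assuming the data.gov.uk file keeps the column headings shown above, something like the following would pull out just the consultation related deep links. The function name `consultation_links` and the second sample row are made up for illustration; only the Adur row comes from the data quoted above.

```python
import csv
import io

# LGSL codes for the consultation related services listed from esd-toolkit above
CONSULTATION_LGSL = {"867", "366", "658", "49", "861", "855"}


def consultation_links(csv_text, lgsl_codes=CONSULTATION_LGSL):
    """Return (authority, service name, LGSL, URL) tuples for matching services."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Authority Name"], row["Service Name"], row["LGSL"], row["Service URL"])
        for row in reader
        if row["LGSL"].strip() in lgsl_codes
    ]


# The first data row is from the CSV extract above; the second is invented
# purely to show a non-consultation service being filtered out.
SAMPLE = """Authority Name,SNAC,LAid,Service Name,LGSL,LGIL,Service URL
Adur District Council,45UB,1,Find out about local consultations,867,8,http://www.adur.gov.uk/consultation/index.htm
Example Council,00XX,2,Some other service,123,8,http://www.example.gov.uk/other
"""

links = consultation_links(SAMPLE)
for authority, service, lgsl, url in links:
    print(authority, lgsl, url)
```

For the real file, the same function could be fed the text of the downloaded CSV rather than the inline sample.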

If I run those URLs through an RSS/Atom feed autodiscovery service, how many open/current consultation feeds do you think I’ll find?!
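By way of illustration, feed autodiscovery usually comes down to looking for `<link rel="alternate">` elements with an RSS or Atom media type in the head of a page. Here’s a minimal sketch using only the Python standard library; it works on HTML that has already been fetched, and the sample page and example.gov.uk URLs are invented rather than taken from any council site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Media types conventionally used to advertise RSS and Atom feeds
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate" ...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def discover_feeds(html, base_url):
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


# Hypothetical page, for illustration only
SAMPLE_PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" href="/consultation/feed.rss">
</head><body></body></html>"""

feeds = discover_feeds(SAMPLE_PAGE, "http://www.example.gov.uk/consultation/index.htm")
print(feeds)
```

Running each deep link from the CSV through something like this (after fetching the page) would give a count of councils that advertise a consultation feed at all, though not whether the feed is actually maintained.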

PS One of the things OpenlyLocal is managing to do is provide an abstraction/normalisation layer over the myriad local council websites. It’s interesting to compare this with the JISC funded Linking You Toolkit that surveyed URL patterns across various UK university websites and made a series of recommendations about a normalised URL scheme that could potentially be used (via URL rewrites) to provide a common URL interface over common areas of UK HE websites (a simplification that I think also fits into the spirit of the normalised data presentation approach being taken with the Key Information Sets). It strikes me that an alternative scheme, at least for the purposes of building services that can map from a central service to deep links related to particular services or content areas of a university website, would be to follow the Local Gov Service ID model and come up with a set of university related services or content areas (potentially reusing those identified by the Linking You project), and then request that universities publish site maps relating deeplink URLs to the appropriate identifier.

PPS as to why I bothered with this post: I’m just trying to document/model an example of the sort of search process I go through whenever I try to find anything out… Which as you can see, is still messed up and informal, starting with Google, then moving to tapping folk I suspect might know the answer to questions I’m trying to articulate, and finally ending up by checking out data.gov.uk…

PPPS Given the full list of government consultation websites for departmental and agency consultations, I wonder: is there a service/content area coding scheme used to identify common areas of central gov department websites?

Opening Up University Energy Data

Knowing how much energy a building uses is often the first step towards reducing its energy footprint, so here’s a quick round up of the university open data initiatives I know of that are based around energy data. (If you know of more, please let me know via the comments.)

First up, via @lncd, here’s a heatmap showing the change in building energy usage compared across the last two days:

Lincoln - 2-day energy data utilisation comparison

This visualisation (built on top of Lincoln U’s open data feeds (http://data.lincoln.ac.uk/)) charts energy usage over a calendar month:

Lincoln U - energy usage over time

Read more about Lincoln’s energy data hacks here: University of Lincoln Energy Data …. an update!.

Over in Oxford, there’s a tool called OpenMeters that displays charts of energy usage by building:

Oxford open energy data

The Oxford data is available as Linked Data from http://data.ox.ac.uk/datasets/, err, I think… As ever, it’d probably take me an hour or two to find out how to make my first query that returns anything meaningful to me in a form I could actually do anything with!;-) I also wonder whether the Oxford project feeds into another Oxford initiative: iMeasure?

(By the by, I misread the title of this paper from the other place to Oxford as “A Linked Data Model Of Building Energy Consumption”: A Limited-Data Model Of Building Energy Consumption, which explores a model “targeted at practical, wide-scale deployment which produces an ongoing breakdown of building energy consumption”.)

Whilst I don’t think Warwick University’s energy monitoring page is based on exposed open data, it does have some dials/gauges on it:

Warwick U - energy monitoring

(By the by – that page doesn’t have a Warwick favicon associated with it in my browser; should it?)

So – a quick round up (and huuuuuuge displacement activity from what I should have been doing this afternoon), but I think it’s worth tracking and reporting on these early demonstrations…

See also: JISC Green ICT projects (which I note don’t seem to return popular Google results for queries relating to university energy data, even when searches are limited to the .ac.uk domain…) and the Greening ICT Programme Community Site; Govspark, which compares energy usage across government departments; Innovations in Campus Mapping for a review of how open data is being used to support enhanced interactive campus maps; and Open Data Powered Location Based Services in UK Higher Education.

PS given Lincoln are publishing all manner of open data, I wonder whether there is enough there to do an ad hoc version of the Heat and light by timetable project using data they’re producing just anyway?!

Open Data Powered Location Based Services in UK Higher Education

One of the Good Things about open data is that with the data being open, there’s less pressure to lock down or restrict access to any of the apps that might build on top of it. Whilst some open data initiatives are based around dumping partial, broken or unmaintained datasets “just because”, other open data initiatives are actually using open data as part of a workflow, where the publication of the open data can be seen as opening up a window onto, and a tap into, a working data pipeline (e.g. Putting Public Open Data to Work…?, Open Data Processes – Taps, Query Paths/Audit Trails and Round Tripping).

A couple of recent announcements show how universities are starting to actually put their open data to work through location based services.

Yesterday(?) saw the appearance of the Southampton OpenData map, as developed by postgrad Colin William et al, using data wrangled into the open by Chris Gutteridge:

The map taps into several of the Southampton open data sets, and displays information locating places to eat and drink (along with an idea of what’s on offer, as well as opening times), computer access (including what software is available on what machines), and live travel information.

Last week, the OU’s Mathieu d’Aquin and Fouad Zablith’s wayOU – Mobile Location Tracking Using Linked Data app took the best demo prize at ESWC2011 (Extended Semantic Web Conference) (paper). This Android app supports check-ins around OU POIs (related: why POIs are more interesting than lat/long):

The code is available from the project’s code repository. (I wonder when we’ll start to see automatic “tracking indoors” using wifi location, cf. Tesco’s In-store ‘Sat-nav’ up and working now in a Tesco branch – come and try it!..?)

So how else is location information and open data being used in UK HEIs? Oxford’s Project Erewhon (review by Scott Wilson) linked into a central service, Oxpoints that provides a location based directory of many of the university’s “entities” (colleges, libraries, museums, buildings, carparks, etc).

Over at Lincoln, Alex Bilbie and Nick Jackson started opening up Lincoln location services last year (I think the services described in that post have moved over to the Lincoln open data site), and as Lincoln’s new facilities software comes online I think we can expect to see a lot more from them…

In Cambridge, the CamLib mobile interface draws on the Cambridge library API to allow you to look up and locate any of the dozens of Cambridge University libraries (about). (Note to self, I should repair the talks@Cam/keeping up with events app I did whilst on my Arcadia Fellowship).

If you know of any other location based, university open data powered apps or services, please add a link in the comments…

PS The number of UK universities that I’m aware of running opendata* projects is still pretty low (The Open University, Southampton, Lincoln to my knowledge, with a couple of others on hold (e.g. Oxford)). My Google search heuristic is: /uni-name/ university open data.

* What does, and what doesn’t, count as “public open data” in the sector is still up for debate… Most universities are (or should be) able to make publicly funded research data public, although at the current time this is still handled in an ad hoc way (universities are still grappling with their open research data policies and repositories). Via several JISC initiatives, several universities have started opening up library catalogue usage data, and activity data from other library and VLE services (for example, Current JISC Projects of Possible Interest to LAK11 Attendees). Some universities have started opening up courses and qualifications catalogue data via XCRI, or via open data initiatives. One rule of thumb might be: university/institutional open data is data that sits on a self-declared institutional “open data” website hosted on a university’s web domain. That is, “open data” is data that is catalogued by the institution as such, and discoverable as such. I suspect that what universities put on such sites over time will have a common core, with additional datasets of local interest.

Related: campus maps on Google maps, e.g. Loughborough University

Another step on the Road to a Distributed data.ac.uk – Southampton University Linked Open Data

Earlier this week, Chris Gutteridge and Dave Challis pushed data.southampton.ac.uk, Southampton University’s Linked Open Data store (Southampton U Data blog), containing for starters at least the following:

  • place data
  • a (non-authoritative) dataset describing the university’s organisational units
  • Academic programme data; this dataset identifies courses according to UCAS course code and JACS code, as well as remodelled Unistats course data for some of the courses.

From what I can tell, Chris has been running round Southampton grabbing data from wheresoever he can get it, so it’ll be interesting to see how the datasets grow out over the coming months;-)

Here’s how I think part of the data looks at the moment?

graph soton {
"Programmes(2010-2011session)"--"OpenDataCatalog";
"JACSCodes"--"OpenDataCatalog";
"JACSCodes"--"StudentStatistics";
"StudentStatistics"--"OpenDataCatalog";
"Programmes(2010-2011session)"--"JACSCodes";
"BuildingsandPlaces"--"OpenDataCatalog";
"PublicPhonebook"--"OpenDataCatalog";
"Organisation"--"OpenDataCatalog";
"PublicPhonebook"--"Organisation";
}

A full list of datasets can be found here.

I wonder if it would be useful if each institution publishing Linked Open Data published an authoritative, local-to-them version of the Linked Open Data Cloud Diagram showing the local datasets and the third party datasets that are directly linked to? As well as the diagram, a data representation of the diagram (e.g. a Graphviz .dot file) would be handy…
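For what it’s worth, generating such a .dot representation from a simple list of dataset links is only a few lines of code. The sketch below is just one way it might be done: the `to_dot` helper is my own illustrative name, and the edge list reuses a few of the dataset names from the graph above rather than anything read from the Southampton store itself.

```python
def to_dot(graph_name, edges):
    """Serialise undirected dataset-to-dataset links as a Graphviz DOT graph."""

    def quote(name):
        # Escape embedded double quotes so each node ID stays a valid DOT string
        return '"' + name.replace('"', '\\"') + '"'

    lines = [f"graph {graph_name} {{"]
    for left, right in edges:
        lines.append(f"  {quote(left)} -- {quote(right)};")
    lines.append("}")
    return "\n".join(lines)


# A few of the links from the graph above, used as sample input
EDGES = [
    ("Programmes(2010-2011session)", "OpenDataCatalog"),
    ("JACSCodes", "OpenDataCatalog"),
    ("PublicPhonebook", "Organisation"),
]

dot = to_dot("soton", EDGES)
print(dot)
```

The printed text can be saved to a file and rendered with the usual Graphviz tools.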

As a quick way in to writing your own queries on the Southampton open data SPARQL endpoint, previews of the queries used to generate results pages in the data store are also provided:

So for example:

A couple of quick observations about the data:

The organisation data looks quite flat at the moment, but I wonder if more structure will become available over time, allowing an organogram for the university to be generated directly from this data? Whenever I see an organisational chart (such as the Soton Corporate Services organisation chart), I can’t help feeling it should be generated from an underlying data description, rather than simply presented as a flat image, with the underlying data published alongside the chart, or progressively enhanced to display the chart. Given the general crapness of institutional search engines, surely we should be able to find a way of using organisational structure and committee workflows to help surface relevant content to folk at a particular location in the organisation/workflow?

The academic/course data is quite thin at the moment, but provides a really important piece of scaffolding for linking to ever richer course related content, as well as linking out through services like UCAS. [UPDATE: by the by, @scottbw just created and shared an RDF XCRI vocabulary for course descriptions, for use with this MLO RDFS (I have no idea what any of that means, either;-).]

In the same way that getting access to postcode data and its various associations was foundational for the development of many location based services in the UK, so access to course code data for building course level applications is key.

I thought it was particularly interesting to see a link from courses to data obtained and remodelled from the Unistats service:

As well as the Southampton open data store, the OU is also running a 5 star linked open data service at data.open.ac.uk for OU Linked data, which is currently exposing module information and data around OU podcasts (OU Linked data on OUseful.info). I think location data is also in the store, though not publicly available yet???

One thing that excites me about the opening up of data across sites is the extent to which institutions will start to open up different datasets to other HEIs, and hopefully drive the wider roll out of data as a result as everybody sees what everyone else is opening up… The other thing that excites is being able to join datasets;-)

So, which university will be next?

“Top Level” URL Conventions in Local Council Open Data Websites

A few days ago, I had reason to start pondering URI schemes for open data released by educational institutions. The OU, like a couple of other HEIs, is looking at structuring – and opening up – various sorts of data, and there are also mutterings around what a data.ac.uk styled site might have to offer.

Being a lazy sort, it seems to me that in figuring out how we might collate data from across the ac.uk environment, we could look to the gov.uk environment. So for example, data.gov.uk acts as a central index over data from both central and local government, which each have their own concerns, and within a type, are likely to share some common features: all local councils will have some of the same sort of data to share, government departments might share some requirements for consistent, centralised reporting (such as website costs and usage) as well as their own peculiar data releases, and so on. In the ac.uk context, we have the HEIs (and FE colleges) in one set, research councils and other project related funding bodies in another.

If we look to local council data, we can also spot intermediate layers appearing that apply a canonical structure to a range of variously published data from the local councils. For example, Openly Local is making a play to act as the canonical source for a whole range of local council data across all the UK’s councils; the Police API “allows you to retrieve information about neighbourhood areas in all 43 English & Welsh police forces”, RateMyPlace is a “one stop shop for information on Food Safety Inspections in Staffordshire”, aggregating information from several councils and representing it via a single API, and so on. (For an example of how different councils can publish ostensibly the same data in a wide variety of formats, see Library Location Data on data.gov.uk).

Looking at the list of local councils with open data sites as collected on the OpenlyLocal open data scoreboard (and as extracted from the OpenlyLocal API via a Yahoo Pipe), are any conventions appearing to emerge in the location of local council open data homepages?

http://www.aberdeencity.gov.uk/open_data/open_data_home.asp (Aberdeen City Council)
http://www.bournemouth.gov.uk/Data/ (Bournemouth Borough Council)
http://www.bristol.gov.uk/opendata (Bristol City Council)
http://www.darlington.gov.uk/Generic/Info/opendata.htm (Darlington Borough Council)
http://www.eaststaffsbc.gov.uk/opendata/Pages/default.aspx (East Staffordshire Borough Council)
http://eastsussex.gov.uk/about/standards/opendata.htm (East Sussex County Council)
http://www.eden.gov.uk/about-this-site/open-data/ (Eden District Council)
http://data.london.gov.uk/ (Greater London Authority)
http://picandmix.org.uk/ (Kent County Council)
http://www2.lichfielddc.gov.uk/data/ (Lichfield District Council)
http://data.lincoln.gov.uk/ (Lincoln City Council)
http://www.brent.gov.uk/xml (London Borough of Brent)
http://www.hillingdon.gov.uk/data (London Borough of Hillingdon)
http://www.sutton.gov.uk/index.aspx?articleid=10077 (London Borough of Sutton)
http://www.rbwm.gov.uk/web/transparency.htm (Royal Borough of Windsor and Maidenhead)
http://www.salford.gov.uk/opendata.htm (Salford City Council)
http://www.stratford.gov.uk/opendata (Stratford-on-Avon)
http://www.sunderland.gov.uk/localpublicdata (Sunderland City Council)
http://www.trafford.gov.uk/opendata/ (Trafford Council)
http://opendata.walsall.org.uk/ (Walsall Metropolitan Borough Council)
http://opendata.warwickshire.gov.uk/ (Warwickshire County Council)
http://www.westberks.gov.uk/index.aspx?articleid=20365 (West Berkshire Council)

With only a small number of councils fully engaged, as yet, with open data, no dominant top level naming scheme has yet appeared, although there are a couple of early runners:

As yet, there is no agreement on the following naming approaches:

Several other councils appear to be offering a specific page to handle (at the moment) open data issues (e.g. http://www.salford.gov.uk/opendata.htm or http://www.westberks.gov.uk/index.aspx?articleid=20365), or even separate domains for their data site (e.g. http://picandmix.org.uk/)
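As a rough way of tallying conventions across a list like the one above, here’s a small sketch that buckets each homepage URL by whether the data site lives on a dedicated subdomain, under a top level path, or somewhere else. The three bucket names, and the sets of names treated as “data-like”, are my own ad hoc choices for illustration rather than any agreed classification.

```python
from collections import Counter
from urllib.parse import urlparse

# Names treated as signalling an open data area; an ad hoc, illustrative choice
DATA_SUBDOMAINS = {"data", "opendata"}
DATA_PATHS = {"data", "opendata", "open_data"}


def classify(url):
    """Bucket an open data homepage URL by where the data site sits."""
    parts = urlparse(url)
    first_label = parts.netloc.lower().split(".")[0]
    if first_label in DATA_SUBDOMAINS:
        return "subdomain"
    path = parts.path.lower().strip("/")
    first_segment = path.split("/")[0] if path else ""
    if first_segment in DATA_PATHS:
        return "top-level path"
    return "other"


# A handful of the URLs from the list above
SAMPLE_URLS = [
    "http://data.london.gov.uk/",
    "http://opendata.warwickshire.gov.uk/",
    "http://www.bristol.gov.uk/opendata",
    "http://www.hillingdon.gov.uk/data",
    "http://www.salford.gov.uk/opendata.htm",
    "http://picandmix.org.uk/",
]

counts = Counter(classify(url) for url in SAMPLE_URLS)
for convention, n in counts.most_common():
    print(convention, n)
```

A stricter or looser set of names would shift the tallies, so the counts are only as meaningful as the buckets chosen.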

Does any of this matter? At the top level, I’m not sure it does, except in setting expectations and providing a sound footing for a scaleable URI scheme. The Cabinet Office Guidance on designing URI sets, which outlines many considerations that need to be taken into account when defining URI schemes particularly for use as identifiers in RDF inspired Linked Data, suggests that domains should “[e]xpect to be maintained in perpetuity” and that “the choice of domain should provide the confidence to the consumer, …, the domain itself … convey[ing] an assurance of quality and longevity.”

In the foreseeable future, I suspect that (pragmatically) the majority of data released in the short term will be published as Excel spreadsheets or informally formatted CSV/TSV data, with some sites publishing raw XML. (As Library Location Data on data.gov.uk describes, even when councils ostensibly release the same sort of data, there is no guarantee that they will do it in similar ways: of the 5 councils publishing the locations of local libraries, 5 different data formats were used… ) It is unlikely that councils will be early adopters of Linked Data across the board. (If they were, it might be seen as excluding users in the short term, because while many people are familiar with working with spreadsheets (a widely adopted “end user” technology for people who work with data in their day job), familiar routes in to and out of Linked Data stores are not there yet…) That said, if local councils do end up wanting to publish data with well formed URIs into the Linked Data space, it would be handy if their current URI scheme was designed with that in mind, and in such a way that the minting of future Linked Data URIs isn’t likely to conflict or clash.