More Observations on the ONS JSON Feeds – Returning Bulletin Text as Data

Whilst starting to sketch out some Python functions for grabbing the JSON data feeds from the new ONS website, I also started wondering how I might be able to make use of them in a simple Slack bot that could provide a crude conversational interface to some of the ONS stats.

(To this end, it would also be handy to see some ONS search logs, to see what sort of things folk search for, and how they phrase their searches…)

One way of using the data is as the basis for some simple data2text scripts that can report the outcomes of some simple canned analyses of the data (comparing the latest figures with those from the previous month, or a year ago, for example). But the ONS also produce commentary on various statistics via their statistical bulletins, and it seems that these, too, are available in JSON form simply by adding /data to the end of the URL path as before:

[Screenshot: UK Labour Market, Office for National Statistics]

One thing to note is that whilst the URL for the HTML view of a bulletin can include a named anchor (the part after the #) to focus the page on a particular section:

http://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/bulletins/uklabourmarket/february2016/#comparison-between-unemployment-and-the-claimant-count

the named anchor doesn’t work as a filter on the JSON output (though it would be easy enough to script a JSON handler to return just that section), so there’s no point adding it to the JSON feed URL:

http://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/bulletins/uklabourmarket/february2016/data
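As a minimal sketch of that URL pattern, the following drops any #anchor from a bulletin page URL and appends /data to the path. The function names data_url and fetch_bulletin are made up for illustration, and the fetch simply assumes the feed returns a single JSON document:

```python
import json
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def data_url(page_url):
    """Return the JSON feed URL for an ONS page URL, dropping any #anchor."""
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")
    if not path.endswith("/data"):
        path += "/data"
    # An empty final component means the fragment is left out of the result.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def fetch_bulletin(page_url, timeout=30):
    """Fetch and parse the JSON feed for a page (assumes a JSON response)."""
    with urlopen(data_url(page_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling data_url() on the HTML bulletin URL above, anchor and all, should give back the JSON feed URL.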

One other thing to note about the JSON feed is that it contains cross-linked elements for items such as charts and tables. If you look closely at the above screenshot, you’ll see it contains a reference to an ons-table.

...
"sections": [
    ...
    {
        "title": "Summary of latest labour market statistics",
        "markdown": "Table A shows the latest estimates, for October to December 2015, for employment, unemployment and economic inactivity. It shows how these estimates compare with the previous quarter (July to September 2015) and the previous year (October to December 2014). Comparing October to December 2015 with July to September 2015 provides the most robust short-term comparison. Making comparisons with earlier data at Section (ii) has more information. <ons-table path=\"cea716cc\" /> Figure A shows a more detailed breakdown of the labour market for October to December 2015. <ons-image path=\"718d6bbc\" />"
    },
    ...
]
...

The referenced table is then described in more detail elsewhere in the data feed, where the filename carries the same ID value as the path attribute of the tag:

...
"tables": [
    {
        "title": "Table A: Summary of UK labour market statistics for October to December 2015, seasonally adjusted",
        "filename": "cea716cc",
        "uri": "/employmentandlabourmarket/peopleinwork/employmentandemployeetypes/bulletins/uklabourmarket/february2016/cea716cc"
    }
],
...
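Following the cross-link from a tag to its description is then just a lookup on the filename. A rough sketch, assuming the feed has already been parsed into a Python dict and using a made-up helper name, find_table:

```python
def find_table(bulletin, path_id):
    """Return the entry in the feed's tables list whose filename matches path_id."""
    for table in bulletin.get("tables", []):
        if table.get("filename") == path_id:
            return table
    return None


# A cut-down stand-in for the parsed feed, using the values shown above.
bulletin = {
    "tables": [
        {
            "title": "Table A: Summary of UK labour market statistics for "
                     "October to December 2015, seasonally adjusted",
            "filename": "cea716cc",
            "uri": "/employmentandlabourmarket/peopleinwork/"
                   "employmentandemployeetypes/bulletins/uklabourmarket/"
                   "february2016/cea716cc",
        }
    ]
}

table = find_table(bulletin, "cea716cc")
print(table["title"])
```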

Images are identified via the ons-image tag, charts via the ons-chart tag, and so on.
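To pull those references out of a section's markdown text, a simple regular expression seems to be enough. This is only a sketch: it assumes every embedded tag is self-closing and has the form <ons-KIND path="ID" />, as in the excerpt above, and the name embedded_resources is my own:

```python
import re

# Matches self-closing tags such as <ons-table path="cea716cc" />
ONS_TAG = re.compile(r'<ons-([a-z]+)\s+path="([^"]+)"\s*/>')


def embedded_resources(markdown):
    """Return (kind, path_id) pairs for each ons-* tag found, in document order."""
    return [(m.group(1), m.group(2)) for m in ONS_TAG.finditer(markdown)]


text = ('Making comparisons with earlier data at Section (ii) has more '
        'information. <ons-table path="cea716cc" /> Figure A shows a more '
        'detailed breakdown of the labour market for October to December '
        '2015. <ons-image path="718d6bbc" />')

print(embedded_resources(text))
```

The kind ("table", "image", "chart" and so on) says which sort of resource to go looking for, and the path ID says which one.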

So now I’m thinking – maybe this is the place to start thinking about a simple conversational UI? Something that can handle simple references into different parts of a bulletin, and return the ONS text as the response?
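As a first, very crude sketch of what that might look like, the following matches a reference (either a #anchor style slug or a few words) against the section titles and returns the section text with the embedded ons-* tags stripped out. The slug format is an assumption on my part (lower-cased words joined by hyphens), the function names are made up, and the sample markdown is abbreviated from the bulletin excerpt:

```python
import re


def slugify(text):
    """Lower-case text and join its words with hyphens (assumed anchor format)."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def find_section(bulletin, reference):
    """Return the first section whose slugified title contains the reference."""
    wanted = slugify(reference)
    for section in bulletin.get("sections", []):
        if wanted and wanted in slugify(section.get("title", "")):
            return section
    return None


def reply(bulletin, reference):
    """Return the matching section's text, minus any embedded ons-* tags."""
    section = find_section(bulletin, reference)
    if section is None:
        return "Sorry, I couldn't find a section matching that."
    text = re.sub(r"<ons-[a-z]+\s[^>]*/>", "", section.get("markdown", ""))
    return " ".join(text.split())  # tidy up the whitespace left behind


# A cut-down stand-in for the parsed feed (markdown abbreviated).
bulletin = {
    "sections": [
        {
            "title": "Summary of latest labour market statistics",
            "markdown": 'Table A shows the latest estimates. '
                        '<ons-table path="cea716cc" /> '
                        'Figure A shows a more detailed breakdown.',
        }
    ]
}

print(reply(bulletin, "summary of latest"))
```

Substring matching on titles is about as naive as it gets, but it is enough to show the shape of the thing: a reference comes in, the ONS text goes back out.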
