Category: Open Data

Revisiting Diabetes Prescribing Data

Last year, I had a quick dabble with creating Data Driven Press Releases From HSCIC Data based around a diabetes prescribing data release. Noticing that the figures for 2015/16 had been released over the summer break, I revisited last year's script to see if it works with this year's data – which it does, save for a few tweaks to the import of the spreadsheets.

So this year’s report for the Isle of Wight is as follows, constructed automatically from the data by looking across several sheets in two spreadsheets:

Figures recently published by the HSCIC for the NHS Isle Of Wight CCG show that for the reporting period 2015/16, the total Net Ingredient Costs (NIC) for prescribed diabetes drugs was £2,579,592.41, representing 9.95% of overall Net Ingredient Costs. The NHS Isle Of Wight CCG prescribed 137,741 diabetes drugs items, representing 4.28% of all prescribed items. The average net ingredient cost (NIC) was £18.73 per item. This compares to 4.14% of items (10.31% of total NIC) in the Wessex (Q70) region and 4.61% of items (10.57% of total NIC) in England.

Of the total diabetes drugs prescribed, Insulins accounted for 21,884 items at a total NIC of £1,071,880.50 (£48.98 per item (on average), 0.68% of overall prescriptions, 4.13% of total NIC) and Antidiabetic Drugs accounted for 94,347 items at a total NIC of £890,424.84 (£9.44 per item (on average), 2.93% of overall prescriptions, 3.43% of total NIC). Diagnostic and monitoring devices accounted for 20,485 items at a total NIC of £605,971.30 (£29.58 per item (on average), 0.64% of overall prescriptions, 2.34% of total NIC).

For the NHS ISLE OF WIGHT CCG, the NIC in 2015/16 per patient on the QOF diabetes register in 2014/15 was £330.42. The QOF prevalence of diabetes, aged 17+, for the NHS ISLE OF WIGHT CCG in 2014/15 was 6.61%. This compares to a prevalence rate of 6.37% in Wessex and 5.83% across England.

Creating reports for other CCGs is simply a matter of changing the CCG code. On the to do list is to pull data from last year as well as this year into a simple database, and then write some more sentence templates that compare the year on year performance, along the lines of the sketch below. (I also really need to have a think about a more sensible way of generating sentences!)
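By way of illustration, here's a minimal sketch of the sort of year-on-year sentence template I have in mind – the figures passed in are made up for the example:

#Minimal sentence template sketch - the example figures are purely illustrative
template = ('The total Net Ingredient Cost for prescribed diabetes drugs was £{nic_now:,.2f}, '
            '{direction} from £{nic_prev:,.2f} the previous year ({change:+.1f}%).')

def yearOnYearSentence(nic_prev, nic_now):
    #Work out the percentage change and pick an appropriate direction word
    change = 100.0 * (nic_now - nic_prev) / nic_prev
    direction = 'up' if change > 0 else 'down'
    return template.format(nic_now=nic_now, nic_prev=nic_prev,
                           direction=direction, change=change)

print(yearOnYearSentence(2500000.00, 2579592.41))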

After an interesting chat last night with Gary Warner from Island based Pinnacle Health Partnership, a social enterprise providing backend services for community pharmacies, I thought I'd have a poke around what services are out there based on the NHS open prescribing data, as a short cut to populating my own databases with the original data (each month's dataset comes in at around 1GB). OpenPrescribing.net seems really useful in this respect, at least as a quick way in to summaries of the data – the API allows you to pull down data by CCG, as well as by GP practices within a CCG, and break out prescriptions by item using BNF codes. A handy look up service also helps find items by BNF section, such as Drugs used in diabetes (BNF 6.1).

I’ve posted a quick sketch notebook as a gist, also embedded below.  But here’s a quick glimpse at some of the first reports I had a look at generating. For example, we can look at the spend associated with particular BNF section codes broken down by GP practices in a particular CCG area:
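The underlying call is along these lines – a sketch only, using the openprescribing spending_by_practice endpoint; the CCG code is just a placeholder, and it's worth checking the API docs for exactly how BNF section codes should be passed:

#Sketch of pulling spend on a BNF section for GP practices in a CCG via the openprescribing API
#The CCG code ('10L') is just a placeholder - swap in the code for the CCG you're interested in
import requests
import pandas as pd

def spendingByPractice(bnf_code, ccg_code):
    url = 'https://openprescribing.net/api/1.0/spending_by_practice/'
    params = {'code': bnf_code, 'org': ccg_code, 'format': 'json'}
    return pd.DataFrame(requests.get(url, params=params).json())

#Drugs used in diabetes is BNF section 6.1, which I think is passed as the code prefix 0601
df = spendingByPractice('0601', '10L')
df.head()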


and then group it by period (I made a crude guess at the financial year but I’m not sure what the dates in openprescribing actually relate to…) so the aggregates are indicative only. The aggregate value over the CCG for item counts seems to be broadly in line with the NHS Diabetes Prescribing report, which I took as weak confirmation that it’s sort of working!
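The crude financial year guess was no more complicated than something like the following sketch (it builds on the spendingByPractice() sketch above and assumes the date, items and actual_cost columns the API responses appear to use):

#Crude guess at a financial year from the period dates returned by the API
#(assumes the 'date', 'items' and 'actual_cost' columns from the dataframe above)
df['date'] = pd.to_datetime(df['date'])
df['fin_year'] = df['date'].apply(lambda d: '{}/{}'.format(d.year, d.year + 1) if d.month >= 4
                                  else '{}/{}'.format(d.year - 1, d.year))

#Indicative aggregates over the CCG by financial year
df.groupby('fin_year')[['items', 'actual_cost']].sum()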


We can also use the API to look up items by BNF section:
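Again, a quick sketch of the sort of call involved, using the bnf_code lookup endpoint:

#Look up BNF sections, chemicals and presentations by name or code via the openprescribing bnf_code lookup
def bnfLookup(q):
    url = 'https://openprescribing.net/api/1.0/bnf_code/'
    return requests.get(url, params={'q': q, 'format': 'json'}).json()

#For example, find things relating to drugs used in diabetes
bnfLookup('diabetes')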


If we loop round the items in a BNF subsection, we can generate reports about the prescribing of particular items within that subsection across practices, merging in the item names to make them easier to identify:
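Something along these lines, building on the earlier sketches – the id and name fields in the lookup response are as I remember them, so treat them as assumptions:

#Sketch: for each item in a BNF subsection, pull the practice level prescribing data
#and merge in the item name (reuses the bnfLookup() and spendingByPractice() sketches above;
#the 'id' and 'name' fields in the lookup response are assumptions)
reports = []
for item in bnfLookup('6.1.2'):
    d = spendingByPractice(item['id'], '10L')
    d['name'] = item['name']
    reports.append(d)
report = pd.concat(reports)
report.head()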


The pandas charting tools aren’t brilliant – after the Yhat refresh of ggplot (for python), I think I need to revisit that library when tinkering in the python context – but we can do crude sketches quite easily.
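For example, a one liner along these lines gives a quick and dirty chart of items prescribed by practice (again assuming openprescribing's row_name and items columns):

#Crude sketch chart - total items by practice for the merged report above
report.groupby('row_name')['items'].sum().sort_values().plot(kind='barh', figsize=(8, 10));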


Anyway, playtime over. It was interesting to give openprescribing.net a go, and given the chance I'll try to play with it a bit more to explore some more quick reports to add to the diabetes notebook.



Bands Incorporated

A few weeks ago, as I was doodling with some Companies House director network mapping code and simple Companies House chatbot ideas, I tweeted an example of Iron Maiden's company structure based on co-director relationships. Depending on how the original search is seeded, the maps may also include elements of band members' own personal holdings/interests. The following map, for example, is seeded just from the Iron Maiden LLP company number:


If you know anything about the band, you’ll know Bruce Dickinson’s aircraft interests make complete sense…

That graph is actually a bipartite graph – nodes are either directors or companies. We can easily generate a projection of the graph that replaces directors that link companies by edges that represent “common director” links between companies:


(The edges are actually weighted, so the greater the edge weight, the more directors there are in common between the linked companies.)
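In networkx terms, the projection only takes a couple of lines – here's a sketch, assuming the bipartite graph G has its company nodes tagged with a bipartite=0 node attribute (an assumption on my part about how the graph was built):

#Sketch of projecting a director/company bipartite graph onto companies,
#with edge weights counting the number of directors the linked companies share
#(assumes company nodes were added to G with bipartite=0 and director nodes with bipartite=1)
from networkx.algorithms import bipartite

company_nodes = [n for n, d in G.nodes(data=True) if d.get('bipartite') == 0]
coG = bipartite.weighted_projected_graph(G, company_nodes)

#Edges with weight > 1 link companies that share more than one director
multi = [(u, v, d['weight']) for u, v, d in coG.edges(data=True) if d['weight'] > 1]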

In today’s Guardian, I notice they’re running a story about Radiohead’s company structure, with a parallel online piece, Radiohead’s corporate empire: inside the band’s dollars and cents, which shows how to get a story out of such a map, as well as how to re-present the original raw map to give it a bit more spatial semantic structure:


(The story also digs into the financial reports from some of the companies.)

By way of comparison, here’s my raw map of Radiohead’s current company structure, generated from Companies House data seeded on the company number for Radiohead Trademark:


It’s easy enough to grab the data for other bands. So how about someone like The Who? If we look in the immediate vicinity of The Who Group, we see core interests:


But if we look for linkage to the next level of co-director links, we start to see other corporate groups that hold at least one shared interest with the band members:


So what other bands incorporated in the UK might be worth mapping?

Want to Get Started With Open Data? Looking for an Introductory Programming Course?

Want to learn to code but never got round to it? The next presentation of OUr FutureLearn course Learn to Code for Data Analysis will teach you how to write your own programme code, a line at a time, to analyse real open data datasets. It starts on 6 June, 2016, runs for 4 weeks, and takes about 5 hrs per week.

I’ve often thought that there are several obstacles to getting started with programming. Firstly, there’s the rationale or context: why bother/what could I possibly use programming for? Secondly, there are the practical difficulties: to write and execute programmes, you need to get a programming environment set up. Thirdly, there’s the so what: “okay, so I can programme now, but how do I use this in the real world?”

Many introductory programming courses reuse educational methods and motivational techniques or contexts developed to teach children (and often very young children) the basics of computer programming to set the scene: programming a “turtle” that can drive around the screen, for example, or garishly coloured visual programming environments that let you plug logical blocks together as if they were computational Lego. Great fun, and one way of demonstrating some of the programming principles common to all programming languages, but they don’t necessarily set you up for seeing how such techniques might be directly relevant to an IT problem or issue you face in your daily life. And it can be hard to see how you might use such environments or techniques at work to help you perform real tasks… (Because programmes can actually be good at that – automating the repetitive and working through large amounts of stuff on your behalf.) At the other extreme are professional programming environments, like geekily bloated versions of Microsoft Word or Excel, with confusing preference setups and menus and settings all over the place. And designed by hardcore programmers for hardcore programmers.

So the approach we’ve taken in the OU FutureLearn course Learn to Code for Data Analysis is slightly different to that.

The course uses a notebook style programming environment that blends text, programme code, and the outputs of running that code (such as charts and tables) in a single, editable web page accessed via your web browser.


To motivate your learning, we use real world, openly licensed data sets from organisations such as the World Bank and the United Nations – data you can download and access for yourself – that you can analyse and chart using your own programme code. A line at a time. Because each line does its own thing, each line is useful, and you can see what each line does to your dataset directly.

So that’s the rationale: learn to code so you can work with data (and that includes datasets much larger than you can load into Excel…)

The practicalities of setting up the notebook environment still have to be negotiated, of course. But we try to help you there too. If you want to download and install the programming environment on your computer, you can do, in the form of the freely available Anaconda Scientific Computing Python Distribution. Or you can access an online version of the notebook based programming environment via SageMathCloud and do all your programming online, through your browser.

So that’s the practical issues hopefully sorted.

But what about the “so what”? Well, the language you’ll be learning is Python, a widely used programming language that makes it ridiculously easy to do powerful things.

Python cartoon – via

But not that easy, perhaps..?!

The environment you’ll be using – Jupyter notebooks – is also a “real world” technology, originally developed as an open source platform for scientific computing but increasingly being used by journalists (data journalism, anyone?) and educators. It’s also attracted the attention of business, with companies such as IBM supporting the development of a range of interactive dashboard tools and backend service hooks that allow programmes written using the notebooks to be deployed as standalone online interactive dashboards.

The course won’t take you quite that far, but it will get you started, and safe in the knowledge that whatever you learn, as well as the environment you’re learning in, can be used directly to support your own data analysis activities at work, or at home as a civically minded open data armchair analyst.

So what are you waiting for? Sign up now and I’ll see you in the comments:-)

Trawling the Companies House API to Generate Co-Director Networks

Somewhen ago (it’s always somewhen ago; most of the world never seems to catch up with what’s already happened!:-( I started dabbling with the OpenCorporates API to generate co-director corporate maps that showed companies linked by multiple directors. It must have been a bad idea because no-one could see any point in it, not even interestingness…  (Which suggests to me that boards made up of directors are similarly meaningless? In which case, how are companies supposed to hold themselves to account?)

I tend to disagree. If I hadn’t been looking at connected companies around food processing firms, I would never have learned that one way meat processors cope with animal fat waste is to feed it into the biodiesel raw material supply chain.

Anyway, if we ever get to see a beneficial ownership register, a similar approach should work to generate maps showing how companies sharing beneficial owners are linked. (The same approach also drives my emergent social positioning Twitter maps and the Wikipedia semantic maps I posted about again recently.)

As a possible precursor to that, I thought I’d try to reimplement the code (in part to see if a better approach came to mind) using data grabbed directly from Companies House via their API. I’d already started dabbling with the API (Chat Sketches with the Companies House API) so it didn’t take much more to get a grapher going…

But first, I realise in that earlier post I’d missed the function for actually calling the API – so here it is:

import urllib2, base64, json
from urllib import urlencode
from urllib2 import HTTPError
from time import sleep

def url_nice_req(url,t=300):
    #Play nicely with the API - if we get rate limited (HTTP 429), rest for a bit then retry
    try:
        return urllib2.urlopen(url)
    except HTTPError, e:
        if e.code == 429:
            print("Overloaded API, resting for a bit...")
            sleep(t)
            return url_nice_req(url)
        raise

#Inspired by
def ch_request(CH_API_TOKEN,url,args=None):
    if args is not None:
        url = '{}?{}'.format(url, urlencode(args))
    request = urllib2.Request(url)
    # You need the replace to handle encodestring adding a trailing newline
    base64string = base64.encodestring('%s:' % (CH_API_TOKEN)).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)
    result = url_nice_req(request)

    return json.loads(result.read())


In the original implementation, I stored the incremental search results in a dict; in the reimplementation, I thought I’d make use of a small SQLite database.

import sqlite3

#tmpDB is the path of the file used for the working SQLite database, eg tmpDB='codirs.sqlite'
if 'db' in locals():
    db.close()
db = sqlite3.connect(tmpDB)
c = db.cursor()

for drop in ['directorslite','companieslite','codirs','coredirs','singlecos']:
    c.execute('''drop table if exists {}'''.format(drop))
c.execute('''create table directorslite
         (dirnum text primary key,
          dirdob integer,
          dirname text)''')

c.execute('''create table companieslite
         (conum text primary key,
          costatus text,
          coname text)''')

c.execute('''create table codirs
         (conum text,
          dirnum text,
          typ text,
          status text)''')

c.execute('''create table coredirs
         (dirnum text)''')

c.execute('''create table singlecos
         (conum text,
          coname text)''')


The code itself runs in two passes. The first pass builds up a seed set of directors from a single company or set of companies using a simple harvester:

#NB: several lines of this function were mangled in the original post; lines marked #reconstructed
#are a best guess at the missing bits, and assume the ch_getCompanyOfficers() and ch_getAppointments()
#helpers described elsewhere, plus dirsdone, dirsparsed and cosdone lists initialised earlier in the notebook
def updateOnCo(seed,typ='current',role='director'):
    print('harvesting {}'.format(seed))
    #Grab the officers of the seed company
    o=ch_getCompanyOfficers(seed,typ=typ,role=role)['items']  #reconstructed
    x=[{'dirnum':p['links']['officer']['appointments'].strip('/').split('/')[1],  #reconstructed
          'dirdob':p['date_of_birth']['year'] if 'date_of_birth' in p else None,
          'dirname':p['name']} for p in o]
    z=[]  #reconstructed
    for y in x:
        if y['dirnum'] not in dirsdone:
            z.append(y)  #reconstructed
            dirsdone.append(y['dirnum'])  #reconstructed
    if isinstance(z, dict): z=[z]
    print('Adding {} directors'.format(len(z)))
    c.executemany('INSERT INTO directorslite (dirnum, dirdob,dirname)'
                     'VALUES (:dirnum,:dirdob,:dirname)', z)
    for oo in [i for i in o if i['links']['officer']['appointments'].strip('/').split('/')[1] not in dirsparsed]:
        oid=oo['links']['officer']['appointments'].strip('/').split('/')[1]  #reconstructed
        print('New director: {}'.format(oid))
        #Play nice with the api
        sleep(0.5)  #reconstructed
        #Grab the companies the director is appointed to
        ooo=ch_getAppointments(oo['links']['officer']['appointments'])  #reconstructed
        dirsparsed.append(oid)  #reconstructed
        #add company details
        x=[{'conum':p['appointed_to']['company_number'],  #reconstructed
          'costatus':p['appointed_to']['company_status'] if 'company_status' in p['appointed_to'] else '',
          'coname':p['appointed_to']['company_name'] if 'company_name' in p['appointed_to'] else ''} for p in ooo['items']]
        z=[]  #reconstructed
        for y in x:
            if y['conum'] not in cosdone:
                z.append(y)  #reconstructed
        if isinstance(z, dict): z=[z]
        print('Adding {} companies'.format(len(z)))
        c.executemany('INSERT INTO companieslite (conum, costatus,coname)'
                     'VALUES (:conum,:costatus,:coname)', z)
        for i in x:cosdone.append(i['conum'])
        #add company director links
        x=[{'conum':p['appointed_to']['company_number'],'dirnum':oid,  #reconstructed
            'typ':'current','status':'director'} for p in ooo['items']]
        c.executemany('INSERT INTO codirs (conum, dirnum,typ,status)'
                     'VALUES (:conum,:dirnum,:typ,:status)', x)
        print('Adding {} company-directorships'.format(len(x)))

The set of seed companies may be companies associated with one or more specified seed directors, for example:

def dirCoSeeds(dirseeds,typ='all',role='all'):
    ''' Find companies associated with dirseeds '''
    #Parts of this function were mangled in the original post - the missing lines are a best guess
    coseeds=[]
    for d in dirseeds:
        for c in ch_getAppointments(d,typ=typ,role=role)['items']:
            coseeds.append(c['appointed_to']['company_number'])
    return coseeds

#Seed the search from the appointments of officers matching a particular name (reconstructed lines)
dirseeds=[]
for d in ch_searchOfficers('Bernard Ecclestone',n=10,exact='forename')['items']:
    dirseeds.append(d['links']['self'])
coseeds=dirCoSeeds(dirseeds,typ='current',role='director')

Then I call a first pass of the co-directed companies search with the set of company seeds:

#Need to handle director or LLP Designated Member
for seed in coseeds:
    updateOnCo(seed,typ='current',role='director')  #reconstructed - harvest each seed company in turn
c.executemany('INSERT INTO coredirs (dirnum) VALUES (?)', [[d] for d in dirsparsed])

seeder_roles=['Finance Director']
#for dirs in seeded_cos, if dir_role is in seeder_roles then do a second seeding based on their companies


Then we go for a crawl for as many steps as required… The approach I’ve taken here is to search through the current database to find the companies heuristically defined as codirected, and then feed these back into the harvester.

#NB: this loop was also partly mangled in the original post; lines marked #reconstructed are a best guess,
#and depth, maxdepth, seeder, oneDirSeed, coseeds and the cosparsed list are assumed to be set up earlier
while depth<maxdepth:
    print('---------------\nFilling out level - {}...'.format(depth))
    if seeder and depth==0:
        #Another policy would be dive on all companies associated w/ dirs of seed
        #In which case set the above test to depth==0
        tofetch=[u[0] for u in c.execute(''' SELECT DISTINCT conum from codirs''')]
    else:  #reconstructed
        #Find pairs of companies that share more than one director
        duals=c.execute('''SELECT cd1.conum as c1,cd2.conum as c2, count(*) FROM codirs AS cd1
                        LEFT JOIN codirs AS cd2
                        ON cd1.dirnum = cd2.dirnum
                        WHERE cd1.conum < cd2.conum GROUP BY c1,c2 HAVING COUNT(*)>1''')
        tofetch=[x for t in duals for x in t[:2]]
        #The above has some issues. eg only 1 director is required, and secretary IDs are unique to company
        #Maybe need to change logic so if two directors OR company just has one director?
        #if relaxed>0:
        #    print('Being relaxed {} at depth {}...'.format(relaxed,depth))
        #    duals=c.execute('''SELECT cd.conum as c1,cl.coname as cn, count(*) FROM codirs as cd JOIN companieslite as cl
        #                 WHERE cd.conum= cl.conum GROUP BY c1,cn HAVING COUNT(*)=1
        #                ''')
        #    tofetch=tofetch+[x[0] for x in duals]
        #    relaxed=relaxed-1
    if depth==0 and oneDirSeed:
        #add in companies with a single director first time round
        sco=[]  #reconstructed
        for u in c.execute('''SELECT DISTINCT cd.conum, cl.coname FROM codirs cd JOIN companieslite cl ON cd.conum=cl.conum''').fetchall():  #join condition reconstructed
            o=ch_getCompanyOfficers(u[0],typ='current',role='director')  #reconstructed
            if len(o['items'])==1 or u[0] in coseeds:
                sco.append({'conum':u[0],'coname':u[1]})  #reconstructed
        c.executemany('INSERT INTO singlecos (conum,coname) VALUES (:conum,:coname)', sco)
    #TO DO: Another strategy might be to try to find the Finance Director or other named role and seed from them?
    #Get undone companies
    print('To fetch: ',[u for u in tofetch if u not in cosparsed])
    #Parse companies
    for u in [x for x in tofetch if x not in cosparsed]:
        updateOnCo(u,typ='current',role='director')  #reconstructed
        cosparsed.append(u)  #reconstructed
        #play nice
        sleep(0.5)  #reconstructed
    depth=depth+1  #reconstructed

To visualise the data, I opted for Gephi, which meant having to export the data. I started off with a simple CSV edgelist exporter:

import csv

#Export an edge list of companies that share more than one director
#(the WHERE/GROUP BY clause was truncated in the original post, so this is a best-guess reconstruction)
data=c.execute('''SELECT cl1.coname as Source,cl2.coname as Target, count(*) FROM codirs AS cd1
                        LEFT JOIN codirs AS cd2 JOIN companieslite as cl1 JOIN companieslite as cl2
                        ON cd1.dirnum = cd2.dirnum and cd1.conum=cl1.conum and cd2.conum=cl2.conum
                        WHERE cd1.conum < cd2.conum GROUP BY Source,Target HAVING COUNT(*)>1''')
with open('output1.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerow(['Source', 'Target'])
    writer.writerows([row[:2] for row in data])  #reconstructed

#Also append links involving the single director companies
data= c.execute('''SELECT cl1.coname as c1,cl2.coname as c2 FROM codirs AS cd1
                        LEFT JOIN codirs AS cd2 JOIN singlecos as cl1 JOIN singlecos as cl2
                        ON cd1.dirnum = cd2.dirnum and cd1.conum=cl1.conum and cd2.conum=cl2.conum
                        WHERE cd1.conum < cd2.conum''')
with open('output1.csv', 'ab') as f:
    writer = csv.writer(f)
    writer.writerows(data)  #reconstructed

but soon changed that to a proper graph file export, based on a graph built around the codirected companies using the networkx package:

import networkx as nx

G=nx.Graph()  #reconstructed - the graph object was missing from the original listing

data=c.execute('''SELECT cl.conum as cid, cl.coname as cn, dl.dirnum as did, dl.dirname as dn
FROM codirs AS cd JOIN companieslite as cl JOIN directorslite as dl ON cd.dirnum = dl.dirnum and cd.conum=cl.conum ''')
for d in data:
    G.add_node(d[0], Label=d[1])
    G.add_node(d[2], Label=d[3])
    G.add_edge(d[0], d[2])  #reconstructed - link each company to its directors
nx.write_gexf(G, "test.gexf")

I then load the graph file into Gephi to visualise the data.

Here’s an example of the sort of thing we can get out for a search seeded on companies associated with the Bernie Ecclestone who directs at least one F1 related company:


On the to do list is to automate this a little bit more by adding some network statistics, and possibly a first pass layout, in the networkx step.

In terms of time required to collect the data, the Companies House API is rate limited to allow 600 requests within a five minute period. Many company networks can be mapped within the 600 call limit, but even for larger networks, the trawl doesn’t take too long even if two or three rest periods are required.

Chat Sketches with the Companies House API, Before the F***kWit UKGov Sell It Off

The ranty title is a gut reaction response to news that the Land Registry faces privatisation.

Sketching around similar ideas to my Slack/slash conversational autoresponder around the Parliament data platform API, I thought I’d have a quick play with the UK Companies House API, which provides a simple interface to company registration data, director information and disqualified director information.

Bulk downloads are available for company registration information (here’s a quick howto about working with it; I’ll post a howto showing how to work with it using a containerised database when I get a chance, but for now, here are some clues) and from the API developer forums it looks as if bulk director’s information is available by request.

Working with your own bulk copy of the data is preferable, because it means you can write your own arbitrarily complex queries over any or all of the columns. The Companies House API, on the other hand, gives you a weak search over company and director names, and the ability to look up individual known records. To do any sort of processing, you need to grab a large amount of search data, and/or make lots of individual known item requests to build your own local datastore, and then search or filter across that.

So for example, here’s the first fumblings of my own function for filtering down on a list of officers associated with a particular company based on director role or current status (which I called typ for some reason? Bah:-(:

def ch_getCompanyOfficers(cn,typ='all',role='all'):
    #typ: all, current, previous
    #The API call itself was missing from the original listing - this line is a reconstruction
    co=ch_request(CH_API_TOKEN,'https://api.companieshouse.gov.uk/company/{}/officers'.format(cn))
    if typ=='current':
        co['items']=[i for i in co['items'] if 'resigned_on' not in i]
        #should possibly check here that len(co['items'])==co['active_count'] ?
    elif typ=='previous':
        co['items']=[i for i in co['items'] if 'resigned_on' in i]
    if role!='all':
        co['items']=[i for i in co['items'] if role==i['officer_role']]
    return co

The next function runs a search over officers by name, but then also lets you filter down the responses to show just those directors who also match a particular search string as part of any company name they are associated with.

def ch_searchOfficers(q,n=50,start_index='',company=''):
    #The search URL and API call were missing from the original listing - these lines are a reconstruction
    url='https://api.companieshouse.gov.uk/search/officers'
    args={'q':q,'items_per_page':n}
    if start_index!='':
        args['start_index']=start_index
    o=ch_request(CH_API_TOKEN,url,args)
    if company != '':
        for p in o['items']:
            p['items'] = [i for i in ch_getAppointments(p['links']['self'])['items'] if company.lower() in i['appointed_to']['company_name'].lower()]
        o['items'] = [i for i in o['items'] if len(i['items'])]
    return o

You get the gist, hopefully. Run a crude API call, and then filter down the result according to particular data properties contained within the search result.

Anyway, as far as the chatting goes, here’s what I’ve started playing around with…

First, let’s just ask what companies a director with a particular name is associated with.


We can take this a bit further by filtering down on the directors associated with a particular company. (Actually, this is simplified now to call the reporting function simply as dirCompanies(c)).


Alternatively, we might try to narrow the search for directors associated with companies in a particular locality. (I’m still trying to get my head round the different logics of this, because companies as well as directors are associated with addresses. I really need to try some specific investigative tasks to get a better feel for how to tune this sort of filter…)


I’ve also started trying to think around the currency of appointments, for example supporting the ability to filter down based on resigned appointments:


Associated with this sort of query (in the sense of exploring the past) are filters that let us search around dissolved companies, for example:


(I should probably also put some time filters in there, for example to search for companies that a particular person was a director of at a particular time…)

We can also search into the disqualified directors register. To try to reduce the sense of turning this into a fishing trip, searching by director name and then filtering by locality feels like it could be handy (though again, this needs some refinement on the way I apply the locality filter.)
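Under the hood, the sort of call involved is along these lines – a sketch rather than the actual code, reusing the ch_request() helper from above; the address/locality fields in the search response are an assumption on my part:

#Sketch: search the disqualified directors register by name, then filter by locality
#(the address/locality fields in the response items are assumptions)
def ch_searchDisqualifiedOfficers(q,locality='',n=50):
    url='https://api.companieshouse.gov.uk/search/disqualified-officers'
    o=ch_request(CH_API_TOKEN,url,{'q':q,'items_per_page':n})
    if locality!='':
        o['items']=[i for i in o['items']
                    if locality.lower() in i.get('address',{}).get('locality','').lower()]
    return o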


Next step on this particular task is to tidy these up a little bit more and then pop them into a Slack responder.

But there are also some other Companies House goodies to come…such as revisiting the notion of co-director based company maps.


Calling an OData Service From Python – UK Parliament Members Data Platform

Whilst having a quick play producing Slack bots and slash commands around the UK Parliament APIs, I noticed (again) that the Members data platform has an OData endpoint.

OData is a data protocol for querying online data services via HTTP requests, although it never really seemed to have caught the popular imagination, possibly because Microsoft thought it up, possibly because it seems really fiddly to use…
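By way of illustration, the queries themselves are just parameterised HTTP GETs – here's a minimal sketch; the service root URL is how I remember it and the entity set name is a placeholder, so treat both as assumptions:

#Minimal sketch of a raw OData query over HTTP - no client library needed
#The service root and the 'Members' entity set are assumptions/placeholders; $top and $format are standard OData options
import requests

root='http://data.parliament.uk/membersdataplatform/open/OData.svc'
params={'$top':'10', '$format':'json'}
r=requests.get('{}/Members'.format(root), params=params)
print(r.json())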

I had a quick look around for a Python client/handler for it, and the closest I came was the pyslet package. I’ve posted a notebook showing my investigations to date here: Handling the UK Parliament Members Data Platform OData Feed, but it seems really clunky and I’m not sure I’ve got it right! (There doesn’t seem to be a lot of tutorial support out there, either?)

Here’s an example of the sort of mess I got myself in:


To make the Parliament OData service more useful, I think it needs a higher level Python wrapper that abstracts a bit further and provides some function calls that make it a tad easier (and more natural) to get at the data. Or maybe I need to step back, have a read of the OData docs, properly get my head around the pyslet OData calls, and try again!

Chatting With ONS Data Via a Simple Slack Bot

A recent post on the ONS Digital blog – Dueling with datasets – describes some of the design decisions taken when putting together the new Office for National Statistics website (such as having a single page for a particular measure that would provide the current figures at the top as well as historical figures further down the page) and some of the challenges still facing the team (such as the language and titling used to describe the statistics).

The emphasis is still very much on publishing the data via a website, however, which promotes two particular sorts of interaction style: browse and search. Via Laura Dewis (Deputy Director, Digital Publishing at Office for National Statistics, and ex- of the OpenLearn parish), I got a peek at some of the popular search terms used on the pre-updated website, which suggest (to me) a mix of vernacular keyword search terms as well as official terms (for example, rpi, baby names, cpi, gdp, retail price index, population, Labour Market Statistics unemployment, inflation, labour force survey).

Over the last couple of years, regular readers will have noticed that I’ve been dabbling with some simple data2text conversions, as well as dipping my toes into some simple custom slackbots (that is, custom slack robots…) capable of responding to simple parameterised queries with texts automatically generated from online data sources (for example, querying the Nomis JSA figures as part of a Slackbot Data Wire, Initial Sketch or my First Steps in a Conversational Slackbot interface to CQC Inspection Data ).

I’m still fumbling around how best to try to put these bots together. On the one hand is trying to work out what sorts of thing we might want to ask of the data, as well as how we might actually ask for it in natural language terms. On the other, is generating queries over the data, and figuring out how to provide the response (creating a canned text around the results from a data query).

But what if there was already a ready source of text interpreting particular datasets that could be used as the response part of a conversational data agent? Then all we’d have to focus on would be parsing queries and matching them to the texts?

A couple of weeks ago, when the new ONS website came out of beta, the human facing web pages were complemented with a data view in the form of JSON feeds that mirrored the HTML text (I don’t know if the HTML is actually generated from the JSON feeds?), as described in More Observations on the ONS JSON Feeds – Returning Bulletin Text as Data. So here we have a ready source of data interpreting text that we may be able to use to provide a backend to a conversational UI to the ONS content. (Whether or not the text is human generated or machine generated is irrelevant – though it does also provide a useful model for developing and testing my own data to text routines!)
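For reference, getting at the JSON view is trivial – as far as I recall, adding /data to the end of a page URL on the new ONS site returns the JSON version of that page (the particular bulletin path below is just a placeholder example):

#Sketch: grab the JSON version of an ONS bulletin page by adding /data to the page URL
#The bulletin path used here is just a placeholder
import requests

page='https://www.ons.gov.uk/economy/inflationandpriceindices/bulletins/consumerpriceinflation/latest'
bulletin=requests.get(page+'/data').json()
print(bulletin.keys())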

So let’s see… it being too wet to go and dig the vegetable patch yesterday, I thought I’d have a quick play trying to put together some simple response rules, in part building on some of the ONS JSON parsing code I started putting together following the ONS website refresh.

Here’s a snapshot of where I’m at…

Firstly, asking for a summary of some popular recent figures:


The latest figures are assumed for some common keyword driven queries. We can also ask for a chart:


The ONS publish different sorts of product that can be filtered against:


So for example, we can run a search to find what bulletins are available on a particular topic:


(For some reason, the markdown isn’t being interpreted as such?)

We can then go on to ask about a particular bulletin, and get the highlights from it:


(I did wonder about numbering the items in the list, retaining the state of the previous response in the bot, and then allowing an interaction along the lines of “tell me more about item 3”?)

We can also ask about other publication types, but I haven’t checked the JSON yet to see whether it makes sense to handle the response from those slightly differently:


At the moment, it’s all a bit Wizard of Oz, but it’s amazing how fluid you can be in writing queries that are matched by some very simple regular expressions:
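By way of illustration, the sort of pattern matching involved is no more sophisticated than something like this (a made up sketch, not the actual bot code):

#Illustrative sketch of simple regular expression query matching - not the actual bot code
import re

def parseQuery(text):
    m=re.search(r'(latest|recent)\s+(figures|data)\s+(?:for|on)\s+(?P<topic>.+)', text, re.IGNORECASE)
    if m:
        return ('summary', m.group('topic').strip(' ?!.'))
    m=re.search(r'what bulletins.*(?:about|on)\s+(?P<topic>.+)', text, re.IGNORECASE)
    if m:
        return ('bulletins', m.group('topic').strip(' ?!.'))
    return (None, None)

print(parseQuery('Can you show me the latest figures on inflation?'))
#('summary', 'inflation')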


So not bad for an hour or two’s play… Next steps would require getting a better idea about what sorts of conversation folk might want to have with the data, and what they actually expect to see in return. For example, it would be possible to mix in links to datafiles, or perhaps even upload datafiles to the slack channel?

PS Hmm, thinks.. what would a slack interface to a Jupyter server be like…?