A Unit of Comparison for Local Council Budget Consultations, Based on Transparency Spending Data – ASCdays?

A few years ago, via the BBC Radio 4 & World Service programme More or Less (incidentally co-produced by the OU), I came across the notion of miciromorts (Contextualising the Chance of Something Happening – Micromorts), a one in a million chance of death that can be used as a unit of risk to compare various likelihoods of dying. Associated with this measure is David Spiegelhalter’s microlife, “30 minutes of your life expectancy”. The point behind the microlife measure is that it provides a way of comparing life threatening risks based on how much life is likely to be lost, on average, when exposed to such risks.

Life expectancy for a man aged 22 in the UK is currently about 79 years, which is an extra 57 years, or 20,800 days, or 500,000 hours, or 1 million half hours. So, a young man of 22 typically has 1,000,000 half-hours (57 years) ahead of him, the same as a 26 year-old woman. We define a microlife as the result of a chronic risk that reduces life, on average, by just one of the million half hours that they have left.

The idea of micromorts came to mind last night as I was reflecting on a public budget consultation held by the Isle of Wight Council yesterday (a day that also saw the Council’s Leader and Deputy Leader resign their positions). The Council needs to improve budgetary matters by £20 million over the next 3 years, starting with £7.5m in the next financial year. This can come through increasing funding, or cuts. By far the biggest chunk of expenditure by the council, as with all councils, is on adult social care (ASC) [community care statistics /via @jonpoole].

As with every year for the past however many years, I’ve had a vague resolution to do something with local council spending data, and never got very far. Early dabblings with the data that I’ve so far played with this year (and intend to continue…) reinforce the notion that ASC is expensive. Here’s a quick summary of the spending data items for October, 2016:

The spend for each of the directorates was as follows:

  • Adult Services:
    • total spend: £7,746,875.55 (48.33%% of total monthly spend)
    • capital: £395,900.06 (5.11% of directorate monthly spend)
    • revenue: £7,350,975.49 (94.89% of directorate monthly spend)
  • Chief Executive:
    • total spend: £501,021.32 (3.13%% of total monthly spend)
    • capital: £492,507.54 (98.30% of directorate monthly spend)
    • revenue: £8,513.78 (1.70% of directorate monthly spend)
  • Childrens Services:
    • total spend: £2,044,524.26 (12.76%% of total monthly spend)
    • capital: £243,675.08 (11.92% of directorate monthly spend)
    • revenue: £1,800,849.18 (88.08% of directorate monthly spend)
  • Place:
    • total spend: £4,924,117.40 (30.72%% of total monthly spend)
    • capital: £974,024.13 (19.78% of directorate monthly spend)
    • revenue: £3,950,093.27 (80.22% of directorate monthly spend)
  • Public Health:
    • total spend: £434,654.13 (2.71%% of total monthly spend)
    • revenue: £434,654.13 (100.00% of directorate monthly spend)
  • Regeneration:
    • total spend: £57.65 (0.00%% of total monthly spend)
    • revenue: £57.65 (100.00% of directorate monthly spend)
  • Resources:
    • total spend: £377,172.20 (2.35%% of total monthly spend)
    • capital: £20,367.87 (5.40% of directorate monthly spend)
    • revenue: £356,804.33 (94.60% of directorate monthly spend)

Cancelling out Adult Services revenue spend for a month would match the £7.5 million required to make up next year’s funds. That’s unlikely to happen, but it does perhaps hint at a possible unit of comparison when trying to make budget decisions, or at least, support budget consultations.

From my naive perspective, adult social care needs to support a certain number of people, a number that evolves (probably?) in line with demographics. One of the ways people exit care is by dying, though the service is set up to minimise harm and help prolong life. Folk may also be transiently cared for (that is, they enter the care system and then leave it). By looking at the amount spent on adult social care, we can come up with an average cost (mean, median?) per person per day of adult social care – ASCdays. We can reduce the total cost by reducing the amount of time folk spend in the system, either by shortening transient stays or postponing entry into the system.

So what I’ve started wondering is this: as one way of trying to make sense of transparency spending data, is there any use in casting it into equivalent units of ASCdays? If we use ASCday equivalent units, can we take a weak systems view and try to get a feel for whether a cut to a particular service (or improvement of another) can help us get a handle on the ASC expenditure – or whether it might cause problems down the line?

For example, suppose a week’s respite care costs the same as two weeks worth of ASCdays. If that week’s respite care keeps someone out of the adult care service for a month, we’re quids in. If cutting respite care saves 100 ASCdays of funding, but is likely to bring just one person into the care system 3 months early, we might start to doubt whether it will actually lead to any saving at all. (Longer tail saves complicate matters given councils need to balance a budget within a financial year. Spending money this year to save next year requires access to reserves – and confidence in your bet…)

For trying to make budget decisions, or helping engage citizens in budgetary consultations, costing things as per ASCday equivalents, and then trying to come up with some probabilities about the likelihood that a particular cut or expense will result in a certain number of people entering or leaving ASC sooner or later, may help you get a feel for the consequences for a particular action.

As to whether prior probabilities exist around whether cutting this service, or supporting that, are likely to impact on the adult care system, maybe data for that is out there, also?

EU Legal Affairs – Proposals on Civil Law Rules on Robotics

The EU Legal Affairs Committee voted last week to call for legislation that is likely to include a definition of “smart autonomous robots”, regulation of the same, and an ethical code of conduct for designers, producers and users.

Note that the full proposal to be put forward is not currently available – the draft is here: COMMITTEES-JURI-PR-2017-01-12-1095387EN), with amendments.

One thing I noticed in the original document was that it seemed limited in it’s definition of “smart autonomous robots” as physically instantiated things, and ignored more general “smart systems” – AI software systems, for example. An amendment in PE592.405 addressed this, broadening the scope to cover “AI” more generally: smart autonomous robots and their subcategories by taking into consideration the following characteristics of a smart robot and an autonomous system, that comprised a physical support or is connected to a software programme without being embedded in a physical support.

When the topic was originally announced, a big thing was made in the news about the calls for robots to be classed as “electronic persons”. A policy study – European Civil Law Rules In Robotics – that fed into the deliberations attempted to debunk this:

3.1. Incongruity of establishing robots as liable legal persons
The motion for a resolution proposes creating a new category of individual, specifically for robots: electronic persons. Paragraph 31(f) calls upon the European Commission to explore the legal consequences of “creating a specific legal status for robots, so that at least the most sophisticated autonomous robots could be established as having the status of electronic persons with specific rights and obligations, including that of making good any damage they may cause [to third parties], and applying electronic personality to cases where robots make smart autonomous decisions or otherwise interact with third parties”.

When considering civil law in robotics, we should disregard the idea of autonomous robots having a legal personality, for the idea is as unhelpful as it is inappropriate.

Traditionally, when assigning an entity legal personality, we seek to assimilate it to humankind. This is the case with animal rights, with advocates arguing that animals should be assigned a legal personality since some are conscious beings, capable of suffering, etc., and so of feelings which separate them from things. Yet the motion for a resolution does not tie the acceptance of the robot’s legal personality to any potential consciousness. Legal personality is therefore not linked to any regard for the robot’s inner being or feelings, avoiding the questionable assumption that the robot is a conscious being. Assigning robots such personality would, then, meet a simple operational objective arising from the need to make robots liable for their actions.

… the motion for a resolution would appear more inclined to fully erase the human presence. In viewing as an electronic person any “robots [which] make smart autonomous decisions or otherwise interact with third parties” (end of paragraph 31(f)), the motion seems to suggest that the robot itself would be liable and become a legal actor. This analysis finds support in paragraph S, which states that “the more autonomous robots are, the less they can be considered simple tools in the hands of other actors […] [and this] calls for new rules which focus on how a machine can be held — partly or entirely — responsible for its acts or omissions”. Once a robot is no longer controlled by another actor, it becomes the actor itself. Yet how can a mere machine, a carcass devoid of consciousness, feelings, thoughts or its own will, become an autonomous legal actor? How can we even conceive this reality as foreseeable within 10 to 15 years, i.e. within the time frame set in paragraph 25 of the motion for a resolution? From a scientific, legal and even ethical perspective, it is impossible today — and probably will remain so for a long time to come — for a robot to take part in legal life without a human being pulling its strings.

What is more, considering that the main purpose of assigning a robot legal personality would be to make it a liable actor in the event of damage, we should note that other systems would be far more effective at compensating victims; for example, an insurance scheme for autonomous robots, perhaps combined with a compensation fund (paragraphs 31(a) to (e)).

We also have to bear in mind that this status would unavoidably trigger unwanted legal consequences. Paragraph T of the motion states that creating a legal personality would mean that robots’ rights and duties had to be respected. How can we contemplate conferring rights and duties on a mere machine? How could a robot have duties, since this idea is closely linked with human morals? Which rights would we bestow upon a robot: the right to life (i.e. the right to non-destruction), the right to dignity, the right to equality with humankind, the right to retire, the right to receive remuneration (an option explicitly explored in paragraph 31(b) of the motion), etc.? …

In reality, advocates of the legal personality option have a fanciful vision of the robot, inspired by science-fiction novels and cinema. They view the robot — particularly if it is classified as smart and is humanoid — as a genuine thinking artificial creation, humanity’s alter ego. We believe it would be inappropriate and out-of-place not only to recognise the existence of an electronic person but to even create any such legal personality. Doing so risks not only assigning rights and obligations to what is just a tool, but also tearing down the boundaries between man and machine, blurring the lines between the living and the inert, the human and the inhuman. Moreover, creating a new type of person – an electronic person – sends a strong signal which could not only reignite the fear of artificial beings but also call into question Europe’s humanist foundations. Assigning person status to a nonliving, non-conscious entity would therefore be an error since, in the end, humankind would likely be demoted to the rank of a machine. Robots should serve humanity and should have no other role, except in the realms of science-fiction.

Yes but, no but… replace “robot” with “company” in the above. Companies have a legal basis, are defined as legal entities, albeit requiring (in the UK at least), at least one human officer to be responsible for them. To what extent could smart systems be similarly treated, or incorporated?

Or consider this – employees, who have human agency have responsibilities, and liabilities incurred by them may or may not become liabilities for their employer. To the extent that smart systems have “machine agency”, to what extent might they also be classed as employees, eg for insurance or liability purposes? (Related: Fragments – Should Algorithms, Deep Learning AI Models and/or Robots be Treated as Employees?.) To be an “employee”, do you also have to be a legal entity?

PS there’s also a clause in the amendments that expresses neo-Luddite tendencies with which I am not totally unsympathetic:

“whereas robotics and AI that can perform similar tasks to those performed by humans should be used mainly to support and boost the abilities of man, as opposed to trying to replace the human element completely”

I haven’t read all the amendments yet – there may be more nuggets in there…

Convention Based Used URLs Support Automation

I spent a chunk of last week at Curriculum Development Hackathon for a Data Carpentry workshop on Reproducible Research using Jupyter Notebooks (I’d like to thank the organisers for the travel support). One of the planned curriculum areas looked at data project organisation, another on automation. Poking around on an NHS data publication webpage for a particular statistical work area just now suggests an example of how the two inter-relate… and how creating inconsistent URLs or filenames makes automatically downloading similar files a bit of a faff when it could be so easy…

To begin with, the A&E Attendances and Emergency Admissions statistical work area has URL:


The crumb trail in the on-page navigation has the form:

Home -> Statistics -> Statistical Work Areas -> A&E Attendances and Emergency Admissions

which we might note jars somewhat with the slug ae-waiting-times-and-activity, and perhaps reflects some sort of historical legacy in how the data was treated previously…

Monthly data is collected on separate financial year related pages linked from that page:



The breadcrumb for these pages has the form:

Home -> Statistics -> Statistical Work Areas -> A&E Attendances and Emergency Admissions -> A&E Attendances and Emergency Admissions 20MM-NN

A few of observations about those financial year related page URLs. Firstly, the path is rooted on the parent page (a Good Thing), but the slug looks mangled together from what looks like a more reasonable parent path (statistical-work-areasae-waiting-times-and-activity; this looks as if it’s been collapsed from statistical-work-areas/ae-waiting-times-and-activity).

The next part of the URL specifies the path to the A & E Attendances and Emergency Admissions page for a particular year, with an appropriate slug for the name – ae-attendances-and-emergency-admissions- but differently formed elements for the years: 2016-17 compared to 2015-16-monthly-3.

(Note that the 2015-16 monthly listing is incomplete and starts in June 2015.)

If we look at URLs for some of the monthly 2016-17 Excel data file downloads, we see inconsistency in the filenames:


(Note that CSV data seems only to be available for the latest (November 2016) data set. I don’t know if this means that the CSV data link only appears for the current month, or data in the CSV format only started to be published in November 2016.)

For the previous year we get:


and so on.

Inspection of these URLs suggests:

  1. the data is being uploaded to and published from a WordPress site (wp-content/uploads);
  2. the path to the data directory for the annual collection is minted according to the month in which the first dataset of the year is uploaded (data takes a month or two to be uploaded, so presumably the April 2016 data was posted in June, 2016 (2016/06); the 2015 data started late – the first month (June 2015) presumably being uploaded in August of that year (2015/08);
  3. the month slug for the data file starts off fine, being of the form MONTH-YEAR-AE-by-provider-, but then breaks things by having some sort of code value that perhaps uniquely identifies the version of the file;
  4. the month slug may be further broken by the addition of a revision element (eg -Revised-11082016).

If the URLs all had a common pattern, it would be easy enough to automate their generation from a slug pattern and month/year combination, and then automatically download them. (I haven’t yet explored inside each spreadsheet to see what inconsistency errors/horrors make it non-trivial to try to combine the monthly data into a single historical data set…)

As it is, to automate the download of the files requires scraping the web pages for the links, or manually retrieving them. (At least the link text on the pages seems to be reasonably consistent!)

Wrangling Time Periods (such as Financial Year Quarters) In Pandas

Looking at some NHS 111 and A&E data today, the reported data I was interested in was being reported for different sorts of period, specifically, months and quarters.

The pandas python library has quite a few tools for dealing with periods, so here are a couple of examples of tricks I put to use today.

Firstly, casting months to a month period. Monthly periods (in column df['Periodname']) were reported in the form “Dec-10”, “Jan-11”, etc, which is to say a three letter month followed by a two digit year representation. If we create a datetime on the first of the corresponding month, we can then cast that date to a month period for the corresponding month:

#Generate a pandas period for time series indexing purposes
#Create a date from each month (add the first date of the month) then set to period
#Note: if the datetime is an index, drop the .dt.
df['_period']=pd.to_datetime(('01-'+df['Periodname']), \

If the datetimes form the dataframe index, we could drop the .dt. element and cast the timeseries index directly to the monthly period. The %b term maps onto the locale specific month description.

The second example had financial years and dates in two columns – Year and Quarter. The months specified in the quarter description mix three letter representations and the fully written name of the month. (As with many NHS datasets, the text used to describe dates can be all over the place, even within a single spreadsheet or datafile!)

If we grab the first three letters of the month, we can identify the month as it’s short form. If we create a date stamp for the first day of the quarter (in the UK, corporate financial years run 1-April, to 31-March (it’s the personal tax year that reports from April 5th?) we can then cast to a quarterly reporting period whose year ends in a particular month, eg March (freq='Q-MAR').


Note that the YearQuarter uses the calendar year in which the financial year ends. We can check the start and end date of the quarters explicitly:


One issue when it comes to plotting the financial year quarterly periods is that the chart looks to me as if everything is a year out:


That is, I read the year as the calendar year, but it’s actually the calendar year in which a particular financial year ends.

So I’m wondering – should pandas really report and plot the financial year in which the fiscal year starts? Is there a switch for this? Or do I need to submit a bug report (and if so, what’s the bug?)

PS for a notebook worked example, see here: NHS 111 example.

An Alternative Way of Motivating the Use of Functions?

At the end of the first of the Curriculum Development Hackathon on Reproducible Research using Jupyter Notebooks held at BIDS in Berkeley, yesterday, discussion turned on whether we should include a short how-to on the use of interactive IPython widgets to support exploratory data analysis. This would provide workshop participants with an example of how to rapidly prototype a simple exploratory data analysis application such as an interactive chart, enabling them to explore a range of parameter values associated with the data being plotted in a convenient way.

In summarising how the ipywidgets interact() function works, Fernando Perez made a comment that made wonder whether we could use the idea of creating simple interactive chart explorers as a way of motivating the use of functions.

More specifically, interact() takes a function name and the set of parameters passed into that function and creates a set of appropriate widgets for setting the parameters associated with the function. Changing the widget setting runs the function with the currently selected values of the parameters. If the function returns a chart object, then the function essentially defines an interactive chart explorer application.

So one reason for creating a function is that you may be able to automatically convert into an interactive application using interact().

Here’s a quick first sketch notebook that tries to set up a motivating example: An Alternative Way of Motivating Functions?

PS to embed an image of a rendered widget in the notebook, select the Save notebook with snapshots option from the Widgets menu:


See also: Simple Interactive View Controls for pandas DataFrames Using IPython Widgets in Jupyter Notebooks

Creating a Self-Updating WordPress Post Using WordPress Transients and Data from a Third Party API

In the post Pondering a Jupyter Notebooks to WordPress Publishing Pattern: MultiMarker Map Widget, I described a simple pattern I started exploring last year that used a custom WordPress shortcode plugin to render data added to one or more custom fields associated with a WordPress post; the post text (including shortcode) and custom fields data were themselves posted into WordPress using some Python code executed from a Jupyter notebook. The idea behind that pattern was to provide a way of automating the creation of custom posts largely from a supplied data set, rendered using a generic shortcode plugin.

Another pattern I explored last year used the WordPress Transients API to cache data pulled from a 3rd party API in the WordPress database, and allow that data to be used by a custom plugin to render the post.

Here’s some example code for a plugin that renders a map containing recent planning applications on the Isle of Wight: the data is grabbed via an API from a morph.io webscraper, which scrapes the data from the Isle of Wight council website.

The two key bits of the script are where I check to see if cached data exisits ( get_transient( 'iwcurrplanningitems' ); and if it doesn’t, grab a recent copy from the API and cache it for 8 hours (set_transient('iwcurrplanningitems', $markers, 60*60*8);).

Plugin Name: IWPlanningLeafletMap
Description: Shortcode to render an interactive map displaying clustered markers. Markers are pulled in via JSON from an external URL. Intended primarily to supported automated post creation. Inspired by folium python library and Google Maps v3 Shortcode multiple Markers WordPress plugin
Version: 1.0
Author: Tony Hirst

//Loaded in from multimarker shortcode
add_action( 'wp_enqueue_scripts', 'custom_scripts' );
add_action( 'wp_enqueue_scripts', 'custom_styles' );

// Add stuff to header
add_action('wp_head', 'IWPlanningLeafletMap_header');
add_action('wp_head', 'fix_css');

function fix_css() { 
	echo '<style type="text/css">#map {
      }</style>' . "\n";

function IWPlanningLeafletMap_header() {

function IWPlanningLeafletMap_call($attr) {
// Generate the map template

	// Default attributes - can be overwritten from shortcode
	$attr = shortcode_atts(array(	
									'lat'   => '50.675', 
									'lon'    => '-1.32',
									'id' => 'iwmap_1',
									'zoom' => '11',
									'width' => '800',
									'height' => '500',
									), $attr);

	$html = '<div class="folium-map" id="'.$attr['id'].'" style="width: '. $attr['width'] .'px; height: '. $attr['height'] .'px"></div>

   <script type="text/javascript">
      var base_tile = L.tileLayer("https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png", {
          maxZoom: 18,
          minZoom: 1,
          attribution: "Map data (c) OpenStreetMap contributors - http://openstreetmap.org"

      var baseLayer = {
        "Base Layer": base_tile

      list of layers to be added
      var layer_list = {

      Bounding box.
      var southWest = L.latLng(-90, -180),
          northEast = L.latLng(90, 180),
          bounds = L.latLngBounds(southWest, northEast);

      Creates the map and adds the selected layers
      var map = L.map("'.$attr['id'].'", {
                                       center:['.$attr['lat'].', '.$attr['lon'].'],
                                       zoom: '.$attr['zoom'].',
                                       maxBounds: bounds,
                                       layers: [base_tile]

      L.control.layers(baseLayer, layer_list).addTo(map);

      //cluster group
      var clusteredmarkers = L.markerClusterGroup();
      //section for adding clustered markers
	$markers =  get_transient( 'iwcurrplanningitems' );
	if ( false === $markers ) {
		$json = file_get_contents($url);
		$markers=json_decode($json, true);
		set_transient('iwcurrplanningitems', $markers, 60*60*8);
	for ($i = 0;$i < count($markers);$i ++){
		$arrkeys=['Agent or Applicant','Location','Proposal'];
		foreach($arrkeys as $arrkey){
			$markers[$i][$arrkey] = str_replace("\n", "<br/>", $markers[$i][$arrkey]);
			$markers[$i][$arrkey] = str_replace("\r", "<br/>", $markers[$i][$arrkey]);
		$html .='
			var marker_'.$i.'_icon = L.AwesomeMarkers.icon({ icon: "info-sign",markerColor: "blue",prefix: "glyphicon",extraClasses: "fa-rotate-0"});
      		var marker_'.$i.' = L.marker(['.$markers[$i]['lat'].','.$markers[$i]['lon'].'], {"icon":marker_'.$i.'_icon});
      marker_'.$i.'.bindPopup("<strong>Consultation start:</strong> '.$markers[$i]['Consultation Start Date'].'<br/><strong>Consultation end:</strong> '.$markers[$i]['Consultation End Date'].'<br/><strong>Location:</strong> '.$markers[$i]['Location'].'<br/><em> '.$markers[$i]['Parish'].' parish, '.$markers[$i]['Ward'].' ward.</em><br/><strong>Proposal:</strong> '.$markers[$i]['Proposal'].'<br/><strong>Agent or Applicant:</strong> '.$markers[$i]['Agent or Applicant'].'<br/><strong>Case Officer:</strong> '.$markers[$i]['Case Officer'].'<br/><em><a href=\'https://www.iwight.com/planning/'.$markers[$i]['stub'].'\'>View application</a></em>");
      marker_'.$i.'._popup.options.maxWidth = 300;
     		//add the clustered markers to the group anyway

	$html .= '</script>';
	return $html;

add_shortcode('IWPlanningLeafletMap', 'IWPlanningLeafletMap_call');

One thing I started to wonder over the Christmas break was whether this approach could provide a way of sharing “data2text” content. For example, having a plugin that creates a canned summary of jobseeker’s allowance figures from data cached from the ONS website? A downside of this is that I’d have to write the data2text script using PHP, which means I couldn’t directly build on related code I’ve written previously…

I also wonder if we could use custom fields to permanently store data for a particular post. For example, we might check whether or not a custom field exists for the post, and if it doesn’t we could create and populate it using data pulled from an API, (possibly keyed by plugin/shortcode parameters, or the post publication date), using a WordPress add_post_meta() function call?

Google’s Appetite for Training Data

A placeholder post – I’ll try to remember to add to this as and when I see examples of Google explicitly soliciting training data from users that can be used to train its AI models…

Locations – “Popular Times”


For example: Google Tracking How Busy Places are by Looking at Location Histories [SEO by the Sea] which also refers to a patent describing the following geo-intelligence technique: latency analysis.

A latency analysis system determines a latency period, such as a wait time, at a user destination. To determine the latency period, the latency analysis system receives location history from multiple user devices. With the location histories, the latency analysis system identifies points-of-interest that users have visited and determines the amount of time the user devices were at a point-of-interest. For example, the latency analysis system determines when a user device entered and exited a point-of-interest. Based on the elapsed time between entry and exit, the latency analysis system determines how long the user device was inside the point-of-interest.