Lib Dems in Government have allocated £300,000 to fund the M20 Junctions 6 to 7 improvement, Maidstone, helping to reduce journey times and create 10,400 new jobs. Really? 10,400 new jobs?
In Critiquing Data Stories: Working LibDems Job Creation Data Map with OpenRefine I had a little poke around some of the data that was used to power a map on a Lib Dems’ website, A Million Jobs:
Liberal Democrats have helped businesses create over 1 million new private sector jobs. Click on the map below to find out what we’ve done where you live.
And then there was the map…
One thing we might take away from this as an assumption is that the markers correspond to locations or environs where jobs were created, and that by adding up the number of jobs created at those locations, we would get to a number over a million.
Whilst I was poking through the data that powers the map, I started to think this might be an unwarranted assumption. I also started to wonder how the “a million jobs” figure was actually calculated.
Using a recipe described in the Critiquing Data Stories post, I pulled out marker descriptions containing the phrase “helping to reduce journey” along with the number of jobs created (?!) associated with those claims, where a number was specified.
Claims were along the lines of:
Summary: Lib Dems in Government have allocated £2,600,000 to fund the A38 Markeaton improvements , helping to reduce journey times and create 12,300 new jobs. The project will also help build 3,300 new homes.
Note that as well as claims about jobs, we can also pull out claims about homes.
If we use OpenRefine’s Custom Tabular Exporter to upload the data to a Google spreadsheet (here), we can use the Google Spreadsheet-as-a-database query tool (as described in Asking Questions of Data – Garment Factories Data Expedition) to sum the total number of jobs “created” by road improvements. From the OpenRefine treatment, I had observed that the rows were all distinct: the count of each text facet was 1.
The sum of jobs “created”? 468,184. A corresponding sum for the number of homes gives 203,976.
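As a rough illustration of the distinctness check and the sum, here is a minimal Python sketch. It is not the spreadsheet query I actually ran: the function name `sum_distinct_claims` and the sample rows are made up for the example, and it assumes each row is a (description, jobs) pair.

```python
from collections import Counter

def sum_distinct_claims(rows):
    """rows is a list of (description, jobs) pairs.

    Returns (total, duplicates): the jobs total counting each distinct
    description once, and a dict of descriptions that appear more than once.
    """
    # Equivalent of eyeballing a text facet: how often does each description occur?
    counts = Counter(desc for desc, _ in rows)
    duplicates = {desc: n for desc, n in counts.items() if n > 1}

    seen = set()
    total = 0
    for desc, jobs in rows:
        if desc in seen:
            continue  # skip repeated claims so they are not counted twice
        seen.add(desc)
        total += jobs
    return total, duplicates

# Hypothetical sample rows (labels invented for illustration only)
sample_rows = [
    ("scheme A", 12300),
    ("scheme B", 10400),
    ("scheme B", 10400),  # a repeated marker
]
total, duplicates = sum_distinct_claims(sample_rows)
```

If `duplicates` comes back empty, every facet count is 1 and a plain sum is safe; otherwise the repeated descriptions are worth a closer look before trusting the total.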
Looking at the refrain through the descriptions, we also notice that the claim is along the lines of: “Lib Dems in Government have allocated £X to fund [road improvement] helping to reduce journey times and create Y new jobs. The project will also help build Z new homes.” “Have allocated”: so it’s not been spent yet? “[To] create Y new jobs”: so they haven’t been created yet? And if those jobs are the result of other schemes made possible by road improvements, will the numbers be double counted? “[W]ill also help build”: so the homes haven’t been built yet, but may well be being claimed as achievements elsewhere?
Note that the numbers I calculated are lower bounds, based on scheme descriptions that contained the specified search phrase (“helping to reduce journey”) and a number of jobs specified according to the pattern detected by the following Jython regular expression:
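import re
# Start from the cell value (the marker description)
tmp=value
# Replace the whole description with just the captured jobs figure
tmp=re.sub(r'.* creat(e|ing) ([0-9,\.]*) new jobs.*',r'\2',tmp)
# If nothing changed, the pattern did not match, so return an empty cell
if value==tmp:tmp=''
# Strip thousands separators, e.g. 12,300 becomes 12300
tmp=tmp.replace(',','')
return tmp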
In addition, the housing numbers were extracted only from rows where a number of jobs was identified by that regular expression, and where they were described in a way that could be extracted using the following Jython regular expression: re.sub(r'.* The project will also help build ([0-9,\.]*) new homes.*',r'\1',tmp)
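For anyone wanting to try the same extraction outside OpenRefine, the two patterns above can be sketched in plain Python. This is an approximation under some assumptions rather than the exact recipe: the helper names (`extract_number`, `summarise`) are invented for the example, it uses `match` with a capture group instead of `re.sub`, and figures that do not parse as whole numbers are simply skipped. The two sample descriptions are the ones quoted in this post.

```python
import re

SEARCH_PHRASE = "helping to reduce journey"
# Same patterns as the Jython expressions, with a non-capturing group for create/creating
JOBS_RE = re.compile(r'.* creat(?:e|ing) ([0-9,\.]*) new jobs.*')
HOMES_RE = re.compile(r'.* The project will also help build ([0-9,\.]*) new homes.*')

def extract_number(pattern, text):
    """Return the captured figure as an int, or None if there is no usable match."""
    m = pattern.match(text)
    if not m:
        return None
    digits = m.group(1).replace(',', '')  # drop thousands separators
    # Assumption: only whole numbers are counted; anything else is ignored
    return int(digits) if digits.isdigit() else None

def summarise(descriptions):
    """Sum jobs and homes over descriptions containing the search phrase.

    Homes are only counted for rows where a jobs figure was also found,
    mirroring the restriction described in the text.
    """
    total_jobs = 0
    total_homes = 0
    for text in descriptions:
        if SEARCH_PHRASE not in text:
            continue
        jobs = extract_number(JOBS_RE, text)
        if jobs is None:
            continue
        total_jobs += jobs
        homes = extract_number(HOMES_RE, text)
        if homes is not None:
            total_homes += homes
    return total_jobs, total_homes

claims = [
    "Lib Dems in Government have allocated £2,600,000 to fund the A38 Markeaton improvements , helping to reduce journey times and create 12,300 new jobs. The project will also help build 3,300 new homes.",
    "Lib Dems in Government have allocated £300,000 to fund the M20 Junctions 6 to 7 improvement, Maidstone, helping to reduce journey times and create 10,400 new jobs.",
]
total_jobs, total_homes = summarise(claims)
```

Because any description phrased differently falls through both patterns, totals produced this way are lower bounds, just like the ones reported above.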
PS I’m reading The Smartest Guys in the Room at the moment, learning about the double counting and accounting creativity employed by Enron, and how confusing publicly reported figures often went unchallenged…
It also makes me wonder about phrases like “up to” providing numbers that are then used when calculating totals?
So there’s another phrase to look for, maybe? have agreed a new ‘City Deal’ with …
Interesting as always.
I think the words ‘support the creation of’ would be more appropriate than ‘create’, in the sense that without the improvements you wouldn’t be able to have the new houses/employment sites.
Also in general these are often very long term (e.g. in local planning documents for the next ~15 years).
Finally, the money allocated by Govt is only part of the money required to create those jobs. The remainder will be made up by local councils and/or contributions from private developers. And yes, it wouldn’t surprise me if in some cases there was double counting.
Have been playing with some of their sources/data & assumptions over the last couple of days…
Interestingly, the Lib Dem press office claimed the source of their data to be ONS and the Treasury, but a good chunk of it looks more like it comes from Highways Agency press releases, and that throws open a bigger problem with it. Even if fixing a road junction contributes to a number of jobs happening near that junction, arguably it doesn’t mean a job has been created at that location: it could just be that a job that would have been somewhere else happens there instead.
In other words: spending seven million on the M6 in Cheshire isn’t going to make A Businessperson go ‘it is slightly easier to get to Sandbach so I shall take on some people in my generic business’. It *might* mean that the business decides to put jobs it was creating anyway near that junction, rather than, say, in London, but that’s a job relocated not created. At a stretch, if the job would otherwise have been outside the UK, the Lib Dems could claim a success, but not otherwise.
There are around 10 markers on the map in Northampton which have exactly the same claim in them.
Wonderfully, if you zoomed out, there was a map pin in Africa – they seem to have fixed that, but without thanking me for pointing it out…
@simon I’m not sure the text around the map actually made claims about how the things described on it related to the million number? Nor were any sources given for the data from what I can tell (though as you hint, maybe departmental press releases are one source!) As I started to discover, the text differed in claims, from reporting the actual creation of jobs to claims that were more ambiguous, such as “contributed to the potential creation of up to a certain number of jobs” (I’m not sure there was one that bad, but you get the point?;-)
There is presumably a policy document, or an appendix to a report somewhere, that describes how many jobs/houses can be claimed for a given road improvement?