Mental Health and the Web – Hounded by Ads, Bullied by Retargeters?

Some thinkses not thought through…

– can web ads feed paranoia, particularly retargeted ads that follow you round the web?
– can personalised ads be upsetting, or appear bullying? For example, suppose you’re critically ill, or in debt, or you’ve been researching a sensitive personal or family matter, and then the ads start to hound you and remind you about it…

Is one take on privacy that it is not to be reminded about something by an external agency? (Is there a more memorable quote along those lines somewhere..?!)

Are personalised ads one of the more pernicious examples of a “filter bubble” effect?

I think I need a proper holiday, well away from the web!

Time for Chaff as Google Analytics Adds Demographic and Interest Based Segmentation?

Via @mhawksey RTing @R3beccaF (I missed Rebecca’s tweet first time round), I notice that “Google Analytics can now segment visitors by age, gender and interests”, as described here: Getting Excited about Google Analytics’ Upcoming Features. The supported dimensions – age, gender and interest – allow you to get some idea about the demographics of your site visitors and segment stats on the same (though I wonder about sampling errors, how the demographic data is associated with user cookies, etc.). Note also that demographic stats have previously been available in other Google products, such as YouTube and (via Karen Blakeman) Blogger, and demographic targeting of ads has been around for some time, of course…

Previously, to get demographic data into Google Analytics, I think you had to push it there yourself via custom variables (eg example; see also some of these sneaky tricks). I quite liked the idea of finessing the acquisition of user demographics data by capturing responses to ads placed via demographic targeting tools…!;-)

In passing, I just wonder about this phrase from the Google Analytics terms of service (my emphasis): You will not (and will not allow any third party to) use the Service to track, collect or upload any data that personally identifies an individual (such as a name, email address or billing information), or other data which can be reasonably linked to such information by Google.

So does this mean Google is free to try to learn from and link to whatever it thinks it can from your custom variable data, for example?

In any case, this all seems in keeping with Google’s aim to do everyone’s tracking on their behalf.

Note to self: get up to speed on cohorts (90 days history only? This section in this post on unified segments suggests at least 6 months history?).

Note to self, 2: how could we go about obfuscating the data collected from us? I wonder about how we might go about creating digital/browser chaff? For example, running a background process that visits random websites and runs random searches under the guise of my Google account…?
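As a starting point for the chaff idea, here is a minimal sketch that only builds a list of random search URLs and fetches nothing. The word list, the function name `chaff_queries` and the choice of two-word queries are all assumptions made for illustration; a real chaff generator would need a far larger vocabulary and some thought about timing and about what the service's terms allow.

```python
import random
from urllib.parse import urlencode

# Hypothetical vocabulary, purely illustrative.
WORDS = ["garden", "recipe", "football", "mortgage",
         "holiday", "guitar", "weather", "bicycle"]


def chaff_queries(n, rng=None, words=WORDS):
    """Return n random two-word search URLs. Nothing is requested over the network."""
    rng = rng or random.Random()
    urls = []
    for _ in range(n):
        # Pick two distinct words and URL-encode them as a single query string.
        query = " ".join(rng.sample(words, 2))
        urls.append("https://www.google.com/search?" + urlencode({"q": query}))
    return urls


# Dry run: print a few candidate chaff URLs.
for url in chaff_queries(3, random.Random(42)):
    print(url)
```

Keeping the generator separate from any fetching step means the plan can be inspected before anything is actually sent.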

I should probably tag this under: targeting countermeasures.

What did you notice for the first time today?

A week late on posting this, catching up with Brian’s notes on the ILI 2013: Future Technologies and Their Applications Workshop we ran last week, and his follow up – What Have You Noticed Recently? – inspired by not properly paying attention to what I had to say, here are a few of my own reflections on what I heard myself saying at the event, along with additional (minor) comments around the set of ‘resource’ slides I’d prepped for the event, though I didn’t refer to many of them…

  • slides 2-6 – some thoughts on getting your eye into some tech trends: OU Innovating Pedagogy reports (2012, 2013), possible data-sources and reports;
  • slides 6-11 – what can we learn from Google Trends and related tools? A big thing: the importance of segmenting your stats; means are often meaningless. The Mothers’ Day example demonstrates two signal causes (in different territories – i.e. different segments) for the compound flowers trend. The Google Correlate example shows how one signal may lead – or lag – another. So the question: do you segment your library data? Do you look for leading or lagging indicators?
  • slides 12-18 – what role should/does/could the library play in developing the reputation of the organisation’s knowledge producers/knowledge outputs, not least as a way of making them more discoverable; this builds on the question of whose role it is to facilitate access to knowledge (along with the question: facilitate access for whom?)? – my take is this fits in the role librarians often take of organising an institution’s knowledge.
  • slides 19-27 – what is a library for? Supporting discovery (of what, by whom)? (Helping others) organise knowledge, and gain access to information? Do research?
  • slides 28-30 – the main focus of my own presentation during the main ILI2013 conference (I’ll post the slides/brief commentary in another post): if the information we want to discover is buried in data, who’s there to help us extract or discover the information from within the data?
  • slides 31-32 – sometimes reframing your perception of an organisation’s offerings can help you rethink the proposition, and sometimes using an analogy helps you switch into that frame of mind. So if energy utilities provide “warm house” and “clean, dry clothes” service, rather than gas or electricity, what shift might libraries adopt?
  • slides 33-39 – a few idle idea prompts around the question of just what is it that libraries do, what services do they provide?
  • slide 40 – one of the items from this slide caused a nightmare tangent! The riff started with a trivial observation – a telling off I received for trying to use the camera on my phone to take a photo of a sign saying “no cameras in the library”, with a photocopier as a backdrop (original story). The purpose of this story was two-fold: 1) to get folk into the idea of spotting anachronisms or situations where one technology is acceptable where an equivalent or alternative is not (and then wonder why/what fun can be had around that thought;-); 2) to get folk into wondering how users might appropriate technology they have to hand to make their lives easier, even if it “goes against the rules”.
  • slide 41 – a thought experiment that I still have high hopes for in the right workshop setting…! if you overheard someone answer a question you didn’t hear with the phrase “did you try the library?”, what might the question be? You can then also pivot the question to identify possible competitors; for example, if a sensible answer to the same question is “did you try Amazon?”, Amazon might be a competitor for the delivery of that service.
  • slide 42 – this can lead on from the previous slide, either directly (replace “library” with “Amazon” or “Google”), or as a way of generating ideas about how else a service might be delivered.
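The segmentation point from slides 6-11 can be illustrated with a toy calculation. The numbers below are made up, and the territory labels are an assumption (the slides only say "different territories"); the sketch just shows how an unsegmented average ends up with two peaks, each of which has a single cause that only becomes visible once the data is split by segment.

```python
from statistics import mean

# Made-up monthly interest scores for "flowers" (January to June).
# Territory labels are illustrative assumptions.
segments = {
    "UK": [10, 12, 60, 11, 10, 9],   # one spike, in March
    "US": [10, 11, 12, 11, 65, 10],  # one spike, in May
}


def combined(segments):
    """Average the segments month by month, as an unsegmented report would."""
    return [mean(values) for values in zip(*segments.values())]


def peak_month(series):
    """Zero-based index of the largest value in a series."""
    return max(range(len(series)), key=series.__getitem__)


overall = combined(segments)
print("combined trend:", overall)
print("peak month per segment:",
      {name: peak_month(values) for name, values in segments.items()})
```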

Slide not there – a riff on the question of: what did you notice for the first time today? This can be important for trend spotting – it may signify that something is becoming mainstream that you hadn’t appreciated before. To illustrate, I’ve started trying to capture the first time I spot tech in the wild with a photo, such as this one of an Amazon locker in a Co-Op in Cambridge, or a noticing from the first time I saw video screens on the Underground.

As with many idea generating techniques, things can be combined. For example, having introduced the notion of Amazon lockers, we might then ask: so what use might libraries make of such a system, or thing? Or if such things become commonplace, how might this affect or influence the expectations of our users??

Google’s New Terms Mean You Could Soon Be Acting as a Product Endorser

If you’re a Google account holder, you may have noticed an announcement recently that Google has changed its terms and conditions, in part to allow it to use your +1s and comments as “shared endorsements” in ads published through Google ad services.

sharedEndorsementexamples

So it seems as if there are now at least two ways Google uses you, me, us, to generate revenue in an advertising context. Firstly, we’re sold as “audience” within a particular segment: “35-50 males into tech”, for example, an audience that advertisers can buy access to. This may even get to the level of individual targeting (for example, Centralising User Tracking on the Web – Let Google Track Everyone For You). Now, secondly, as personal endorsers of a particular company, service or product.

The ‘recent changes’ announcement URL looks like a general “change notice” URL – https://www.google.co.uk/intl/en/policies/terms/changes/ – so I’ll repost key elements from the announcement here….

“Because many of [us] are allergic to legalese”, the announcement goes, “here’s a plain English summary for [our] convenience.”

We’ve made three changes:

Firstly, clarifying how your Profile name and photo might appear in Google products (including in reviews, advertising and other commercial contexts).

You can control whether your image and name appear in ads via the Shared Endorsements setting.

Secondly, a reminder to use your mobile devices safely.
Thirdly, details on the importance of keeping your password confidential.

The first change – how my Profile name and photo might appear in Google products – is the one I’m interested in.

How your Profile name and photo may appear (including in reviews and advertising)

We want to give you, and your friends and connections, the most useful information. Recommendations from people that you know can really help. So your friends, family and others may see your Profile name and photo, and content like the reviews that you share or the ads that you +1’d. This only happens when you take an action (things like +1’ing, commenting or following) – and the only people who see it are the people that you’ve chosen to share that content with. On Google, you’re in control of what you share. This update to our Terms of Service doesn’t change in any way who you’ve shared things with in the past or your ability to control who you want to share things with in the future.

Feedback from people you know can save you time and improve results for you and your friends across all Google services, including Search, Maps, Play and in advertising. For example, your friends might see that you rated an album 4 stars on the band’s Google Play page. And the +1 you gave your favourite local bakery could be included in an ad that the bakery runs through Google. We call these recommendations shared endorsements and you can learn more about them here.

When it comes to shared endorsements in ads, you can control the use of your Profile name and photo via the Shared Endorsements setting.

Here’s a direct link to the setting… [if you have a Google+ account, I suggest you go there, uncheck the box, and hit “Save”]. I never knowingly checked this – so presumably the default is set to checked (that is, with me opted in to the “service”)?


If you turn the setting to “off,” …

you’ll get hassled:

F**k you, google...

or to put it another way,

…your Profile name and photo will not show up on that ad for your favourite bakery or any other ads. This setting only applies to use in ads, and doesn’t change whether your Profile name or photo may be used in other places such as Google Play.

I have no idea what the context of Google Play might mean. I do have a Google Android phone, and it is tied to a Google account. It is largely a mystery to me, particularly when it comes to knowing who has access to – or has taken copies of – my contacts. I have no idea what Google Play services I have or have not been opted in to.

If you previously told Google that you did not want your +1’s to appear in ads, then of course we’ll continue to respect that choice as a part of this updated setting.

I’m not sure what that means? If I’ve checked “do not want my +1’s to appear in ads” box, will the current setting be set to unchecked (opt out of shared endorsements)? Does the original setting still exist somewhere, or has it been replaced by the new setting? Or is there another level of privacy setting somewhere, and if so how do the various levels interact?

This is on my current Google+ settings page:

shared endorsements

and I can’t see anything about +1 ad opt outs, so presumably the setting has changed? I’d have thought I’d have opted out of allowing +1s to appear in ads (had I known: a) that +1s may have been used in ads; and b) that such a setting existed), but presumably that fact passed me by (more on this later in the post…). Or I had opted out and the opt-out wasn’t respected? But surely not that…?

For users under 18, their actions won’t appear in shared endorsements in ads and certain other contexts.

Which is to say, ‘if you lied about your age in order to access particular services, we’re gonna sell the ability for advertisers to use you to endorse their products to your friends’.

So that’s the “helpful” explanation of the terms.. what do the actual terms say?

When you upload or otherwise submit content to our Services, you give Google (and those we work with) a worldwide licence to use, host, store, reproduce, modify, create derivative works (such as those resulting from translations, adaptations or other changes that we make so that your content works better with our Services), communicate, publish, publicly perform, publicly display and distribute such content. The rights that you grant in this licence are for the limited purpose of operating, promoting and improving our Services, and to develop new ones. This licence continues even if you stop using our Services (for example, for a business listing that you have added to Google Maps). Some Services may offer you ways to access and remove content that has been provided to that Service. Also, in some of our Services, there are terms or settings that narrow the scope of our use of the content submitted in those Services. Make sure that you have the necessary rights to grant us this licence for any content you submit to our Services. [This para, or one very much like it, is in the current terms.]

If you have a Google Account, we may display your Profile name, Profile photo and actions you take on Google or on third-party applications connected to your Google Account (such as +1’s, reviews you write and comments you post) in our Services, including displaying in ads and other commercial contexts. We will respect the choices you make to limit sharing or visibility settings in your Google Account. For example, you can choose your settings so that your name and photo do not appear in an ad.

Hmmm.. so maybe the settings do – or will – have a finer level of control (and complexity…) associated with them? I wonder also whether those two paragraphs can work together? If I comment on a Google+ page, or maybe tag a brand or product in an image I have uploaded, could Google create a derivative work as part of a shared endorsement by me?

Looking Around Some Other Google+ Settings

Finding myself on my Google+ settings page, I had a look at some of the other settings…

Be wary of implicit reveals?

Hmm… this could be an issue, if checked? If things are shared to people in my circles, and folk get automatically added to my circles if I just search for them, then, erm, I could maybe unwittingly opt a page in to my circles?

circle shares

But if I do search for someone and they’re added to my circles on my behalf, what circle are they added to?

so which do they get added to?

Not being paranoid or anything, but I can now also imagine something like the following setting appearing on my main Google account insofar as it relates to search, for example:

Google Search Pages
_ Automatically add a Google+ Author to my circles if I click through on a search result marked with a Google+ Author tag.

So what other settings are there that may be of interest?

Several to do with automatically tampering with my content (as if false memory syndromes aren’t bad enough!)

mess with my stuff...

do stuff to my stuff

I seem to remember these being announced, but didn’t think to check that I would automatically be opted in.

Note to self: When Google announces a new Google+ service, or service related to Google accounts, assume I get automatically opted in.

Any others? Ah, ha… a little something that invisibly enmeshes me a little deeper in the Google knowledge web:

link me in to the Google knowledge graph

Here’s the blurb, rather bluntly entitled Find My Face: “Find my face makes finding pictures and videos of you easy and more social. Find my face offers name tag suggestions so you, or people that you know, can quickly tag photos. Any time someone tags you in a photo or video, you’ll be able to accept or reject name tags created by people you know.”

So I’m guessing if I opt in to this, if Google recognises that I’m in a photo, and someone I know views that photo, they’ll be prompted to tag me in it. I wonder if Google actually has a belief graph and a knowledge graph? In the first case, the belief graph would associate me with photos Google’s algorithms think I’m in. In the second case, the knowledge graph, Google would associate me with photos where someone confirms that I am in the photo. If you want to get geeky, this knowledge vs. belief distinction, where knowledge means “justified true belief”, has a basis in things like epistemic logic (which I came across in the context of agent logics) – I’d never really thought about Google’s graph in this way… Hmmm…

Here’s how it works, apparently:

After you turn on Find my Face, Google+ uses the photos or videos you’re tagged in to create a model of your face. The model updates as tags of you are added or removed and you can delete the entire face model at any time by turning off Find my Face.

If you turn on Find my Face, we can use your face model to make it easier to find photos or videos of you. For example, we’ll show a suggestion to tag you when you or someone you know looks at a photo or video that matches your face model. Name tag suggestions by themselves do not change the sharing setting of photos or albums or videos. However, when someone approves the suggestion to add a name tag, the photo and relevant album or video are shared with the person tagged.

So can Google sell that face model of me to other parties? Or just sell recognition of my face in photos and videos as a service, or as part of an audience construction process?

I guess at least I get to approve any photo tags though… Or do I?

Act on my behalf

So if I search for someone on Google+, they’re added to my circles, which means that if they tag me in a photo when prompted by Google+ to do so, their tag is automatically accepted by me by virtue of this proxy setting I seem to have been automatically opted in to? Or am I reading these settings all wrong?

Ho hum, I guess it’s not even the legalese I’m allergic to… it’s understanding the emergent complexity and consequences that arise from different combinations of settings on personal account pages…

ScreenScraping HTML Web Pages With OpenRefine – Norwegian Oil Company Data

[An old post, rescued from the list of previously unpublished posts…]

Although I use OpenRefine from time to time, one thing I don’t tend to use it for is screenscraping HTML web pages – I tend to write Python scripts in Scraperwiki to do this. Writing code is not for everyone, however, so I’ve brushed off my searches of the OpenRefine help pages to come up with this recipe for hacking around with various flavours of company data.

The setting actually comes from OpenOil’s Johnny West:

1) given the companies in a particular spreadsheet… for example “Bayerngas Norge AS” (row 6)
2) plug them into the Norwegian govt’s company registry — http://www.brreg.no/ (second search box down nav bar on the left) – this gives us corporate identifier… so for example… 989490168
3) plug that into purehelp.no — so http://www.purehelp.no/company/details/bayerngasnorgeas/989490168
4) the Aksjonærer at the bottom (the shareholders that hold that company) – their percentages
5) searching OpenCorporates.com with those names to get their corporate identifiers and home jurisdictions
6) mapping that back to the spreadsheet in some way… so for each of the companies with their EITI entry we get their parent companies and home jurisdictions

Let’s see how far we can get…

To start with, I had a look at the two corporate search sites Johnny mentioned. Hacking around with the URLs, there seemed to be a couple of possible simplifications:

– looking up company ID can be constructed around http://w2.brreg.no/enhet/sok/treffliste.jsp?navn=Bayerngas+Norge+AS – the link structure has changed since I originally wrote this post, correct form is now http://w2.brreg.no/enhet/sok/treffliste.jsp?navn=Bayerngas+Norge+AS&orgform=0&fylke=0&kommune=0&barebedr=false [h/t Larssen in the comments.]

http://www.purehelp.no/company/details/989490168 (without company name in URL) appears to work ok, so can get there from company number.
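For anyone who would rather build these lookup URLs in a script, here is a minimal Python sketch. The base URLs and query parameters are copied from the examples above; the function names are made up for illustration, and there is no guarantee that either site still answers at these addresses.

```python
from urllib.parse import urlencode

# URL patterns as quoted in the post; they may have changed since.
BRREG_SEARCH = "http://w2.brreg.no/enhet/sok/treffliste.jsp"
PUREHELP_DETAILS = "http://www.purehelp.no/company/details/"


def brreg_search_url(company_name):
    """Build the company-register search URL for a company name."""
    params = {
        "navn": company_name,
        "orgform": 0,
        "fylke": 0,
        "kommune": 0,
        "barebedr": "false",
    }
    # urlencode turns spaces into '+', matching the URL form shown in the post.
    return BRREG_SEARCH + "?" + urlencode(params)


def purehelp_url(company_id):
    """Build the purehelp details URL from a company number alone."""
    return PUREHELP_DETAILS + str(company_id)


print(brreg_search_url("Bayerngas Norge AS"))
print(purehelp_url(989490168))
```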

Loading the original spreadsheet data into OpenRefine gives us a spreadsheet that looks like this:

openRefine xls import

So that’s step 1…

We can run step 2 as follows* – create a new column from the company column:

* see the end of the post for an alternative way of obtaining company identifiers using the OpenCorporates reconciliation API…

openRefine add new col

Here’s how we construct the URL:

OpenRefine - get new col by URL

The HTML is a bit of a mess, but by Viewing Source on an example page, we can find a crib that leads us close to the data we require, specifically the fragment detalj.jsp?orgnr= in the URL of the first of the href attributes of the result links.

table to scrape - crib

Using that crib, we can pull out the company ID and the company name for the first result, constructing a name/id pair as follows:

[value.parseHtml().select("a[href^=detalj.jsp?orgnr=]")[0].htmlAttr("href").replace('detalj.jsp?orgnr=','').toString() , value.parseHtml().select("a[href^=detalj.jsp?orgnr=]")[0].htmlText() ].join('::')

The first part – value.parseHtml().select("a[href^=detalj.jsp?orgnr=]")[0].htmlAttr("href").replace('detalj.jsp?orgnr=','').toString() – pulls out the company ID from the first search result, extracting it from the URL fragment.

The second part – value.parseHtml().select("a[href^=detalj.jsp?orgnr=]")[0].htmlText() – pulls out the company name from the first search result.

We place these two parts into an array and then join them with two colons: [].join('::')

This keeps things tidy and allows us to check by eye that sensible company names have been found from the original search strings.
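For comparison, roughly the same extraction can be sketched in Python using only the standard library's `html.parser`. This is not a translation of how OpenRefine works internally, just an equivalent under stated assumptions: the class and function names are invented, the sample HTML is a hypothetical stand-in for the real results page, and unlike the GREL expression it returns `None` rather than raising an error when no matching link is found.

```python
from html.parser import HTMLParser

PREFIX = "detalj.jsp?orgnr="  # the crib found by viewing source


class FirstResult(HTMLParser):
    """Capture the href and text of the first <a> whose href starts with PREFIX."""

    def __init__(self):
        super().__init__()
        self.company_id = None
        self.name = ""
        self._in_link = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and not self._done and href.startswith(PREFIX):
            self.company_id = href[len(PREFIX):]
            self._in_link = True

    def handle_data(self, data):
        if self._in_link:
            self.name += data

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self._in_link = False
            self._done = True  # ignore every later result link


def name_id_pair(html):
    """Return 'id::name' for the first search result, or None if there is none."""
    parser = FirstResult()
    parser.feed(html)
    if parser.company_id is None:
        return None
    return "::".join([parser.company_id, parser.name.strip()])


# Hypothetical fragment of a results page (the second row is invented).
sample = (
    '<table><tr><td><a href="detalj.jsp?orgnr=989490168">BAYERNGAS NORGE AS</a></td></tr>'
    '<tr><td><a href="detalj.jsp?orgnr=123456789">SOME OTHER AS</a></td></tr></table>'
)
print(name_id_pair(sample))
```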

open refine - compare names

We can now split the name/ID pair column into two separate columns:

openRefine spilt column into cols

And the result:

openrefne  cols now split

The next step, step 3, requires looking up the company IDs on purehelp. We’ve already seen how a new column can be created from a source column by URL, so we just repeat that approach with a new URL pattern:

openrefine add another col by URL

(We could probably reduce the throttle time by an order of magnitude!)

The next step, step 4, is to pull out shareholders and their percentages.

The first step is to grab the shareholder table and each of the rows, which in the original looked like this:

shareholders table

The following hack seems to get us the rows we require:

[REMOVED]

BAH – crappy page sometimes has TWO companyOwnership IDs, when the company has shareholdings in other companies as well as when it has shareholders:-(

fckwt

So much for unique IDs… ****** ******* *** ***** (i.e. not happy:-(

Need to search into table where “Shareholders” is specified in top bar of the table, and I don’t know offhand how to do that using the GREL recipe I was taking because the HTML of the page is really horrible. Bah…. #ffs:-(

Question, in GREL, how do I get the rows in this not-a-table? I need to specify the companyOwnership id in the parent div, and check for the Shareholders text() value in the first child, then ideally miss the title row, then get all the shareholder companies (in this case, there’s just one; better example):

<div id="companyOwnership" class="box">
	<div class="boxHeader">Shareholders:</div>
	<div class="boxContent">
		<div class="row rowHeading">
			<label class="fl" style="width: 70%;">Company name:</label>
			<label class="fl" style="width: 30%;">Percentage share (%)</label>
			<div class="cb"></div>
		</div>
		<div class="row odd">
			<label class="fl" style="width: 70%;">Shell Exploration And Production Holdings</label>
			<div class="fr" style="width: 30%;">100.00%</div>
			<div class="cb"></div>
		</div>
	</div>
</div>

For now I’m going to take a risky shortcut and assume that the Shareholders (are there always shareholders?) are the last companyOwnership ID on the page:

forEach(value.parseHtml().select('div[id=companyOwnership]')[-1].select('div.row'),e,e).join('::')
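Outside OpenRefine, one way to avoid the "last companyOwnership block" shortcut is to pick the block by its header text instead. The following is a sketch in Python (not GREL) using the standard library parser. It assumes the page structure matches the fragment quoted above; the class and function names are invented, and the second block in the sample, including its "Subsidiaries:" header, is hypothetical.

```python
from html.parser import HTMLParser


class OwnershipParser(HTMLParser):
    """Collect the header text and data rows of every div#companyOwnership block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._block = None    # block currently being read
        self._depth = 0       # div nesting depth inside that block
        self._row = None      # data row currently being read
        self._row_depth = 0
        self._capture = None  # "header", "name" or "share"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div":
            if self._block is None:
                if attrs.get("id") == "companyOwnership":
                    self._block = {"header": "", "rows": []}
                    self._depth = 1
                return
            self._depth += 1
            if "boxHeader" in classes:
                self._capture = "header"
            elif "row" in classes and "rowHeading" not in classes:
                # A data row; the heading row is skipped.
                self._row = {"name": "", "share": ""}
                self._row_depth = self._depth
            elif "fr" in classes and self._row is not None:
                self._capture = "share"
        elif tag == "label" and self._row is not None and "fl" in classes:
            self._capture = "name"

    def handle_data(self, data):
        if self._capture == "header":
            self._block["header"] += data
        elif self._capture in ("name", "share"):
            self._row[self._capture] += data

    def handle_endtag(self, tag):
        if tag == "label" and self._capture == "name":
            self._capture = None
        elif tag == "div" and self._block is not None:
            if self._capture in ("header", "share"):
                self._capture = None
            if self._row is not None and self._depth == self._row_depth:
                self._block["rows"].append(self._row)
                self._row = None
            self._depth -= 1
            if self._depth == 0:
                self.blocks.append(self._block)
                self._block = None


def shareholders(html, header="Shareholders"):
    """Return (name, percentage) pairs from blocks whose header starts with `header`."""
    parser = OwnershipParser()
    parser.feed(html)
    return [
        (row["name"].strip(), float(row["share"].strip().rstrip("%")))
        for block in parser.blocks
        if block["header"].strip().startswith(header)
        for row in block["rows"]
    ]


SAMPLE = """
<div id="companyOwnership" class="box">
  <div class="boxHeader">Shareholders:</div>
  <div class="boxContent">
    <div class="row rowHeading"><label class="fl">Company name:</label>
      <label class="fl">Percentage share (%)</label><div class="cb"></div></div>
    <div class="row odd"><label class="fl">Shell Exploration And Production Holdings</label>
      <div class="fr">100.00%</div><div class="cb"></div></div>
  </div>
</div>
<div id="companyOwnership" class="box">
  <div class="boxHeader">Subsidiaries:</div>
  <div class="boxContent">
    <div class="row odd"><label class="fl">Example Subsidiary AS</label>
      <div class="fr">50.00%</div><div class="cb"></div></div>
  </div>
</div>
"""
print(shareholders(SAMPLE))
```

Because the block is chosen by header rather than by position, it should not matter whether the shareholders block comes first or last on the page, provided the header wording is what the sketch assumes.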

openrefine last company ownership

We can then generate one row for each shareholder in OpenRefine:

open refine - spilt

(We’ll need to do some filling in later to cope with the gaps, but no need right now. We also picked up the table header, which has been given its own row, which we’ll have to cope with at some point. But again, no need right now.)

For some reason, I couldn’t parse the string for each row (it was late, I was confused!) so I hacked this piecemeal approach to try to take them by surprise…

value.replace(/\s/,' ').replace('<div class="row odd">','').replace('<div class="row even">','').replace('<form>','').replace('<label class="fl" style="width: 70%;">','').replace('<div class="cb"></div>','').replace('</form> </div>','').split('</label>').join('::')

horrible hack openrefine

Using the trick we previously applied to the combined name/ID column, we can split these into two separate columns, one for the shareholder and the other for their percentage holding (I used possibly misleading column names below – should say “Shareholder name”, for example, rather than shareholding 1?):

openrefine column split

We then need to tidy the two columns:

value.replace("<\/a>",'').replace(/.*>/,'')

Note that some of the shareholder companies have identifiers in the website we scraped the data from, and some don’t. We’re going to be wasteful and throw the info away that links the company if it’s there…

value.replace('<div class="fr" style="width: 30%;">','').replace('</div>','').strip()

We now need to do a bit more tidying – fill down on the empty columns in the shareholder company column and also in the original company name and ID [actually – this is not right, as we can see below for the line Altinex Oil Norway AS…? Maybe we can get away with it though?], and filter out the rows that were generated as headers (text facet then select out blank and Firmanavn).
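To make the fill-down and filter step concrete, here is a small Python sketch of a cautious variant: it fills down only the original company name and ID, then drops rows whose shareholder cell is blank or a header label. The column names, the function names and the header label passed in are all assumptions for illustration. Dropping blank shareholder rows, rather than filling them down, is one way to avoid a company with no listed shareholders inheriting the previous company's.

```python
def fill_down(rows, columns):
    """Copy the last non-empty value of each named column into blank cells below it."""
    last = {}
    filled = []
    for row in rows:
        row = dict(row)  # work on a copy so the input rows are left untouched
        for col in columns:
            if row.get(col):
                last[col] = row[col]
            elif col in last:
                row[col] = last[col]
        filled.append(row)
    return filled


def drop_header_rows(rows, column, header_values=("", "Firmanavn")):
    """Remove rows whose `column` is blank or holds a header label."""
    return [r for r in rows if (r.get(column) or "").strip() not in header_values]


# Hypothetical rows as they might look after splitting shareholders into rows.
rows = [
    {"company": "A AS", "id": "1", "shareholder": "Firmanavn"},
    {"company": "", "id": "", "shareholder": "X Holdings"},
    {"company": "", "id": "", "shareholder": "Y Holdings"},
    {"company": "B AS", "id": "2", "shareholder": ""},
]
clean = drop_header_rows(fill_down(rows, ["company", "id"]), "shareholder")
for row in clean:
    print(row)
```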

This is what we get:

COmpany ownership

We can then export this file, before considering steps 5 and 6, using the custom exporter:

open refine exporter

Select the columns – including the check column of the name of the company we discovered by searching on the names given in the original spreadsheet… these are the names that the shareholders actually refer to…

column export

And then select the export format:

column export format

Here’s the file: shareholder data (one of the names at least appears not to have worked – Altinex Oil Norway AS). Looking at the data, I think we also need to take the precaution of using .strip() on the shareholder names.

Here’s the OpenRefine project file to this point [note the broken link pattern for brreg noted at the top of the post and in the comments… The original link will be the one used in the OpenRefine project…]

Maybe export on a filtered version where Shareholding 1 is not empty. Also remove the percentage sign (%) in the shareholding 2 column? Also note that Andre is “Other”… maybe replace this too?
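Those last tidying steps (strip the names, drop the percentage sign, swap "Andre" for "Other") could be sketched in Python along the following lines. The function name is invented, and returning the share as a float, with `None` for an empty cell, is a design choice of this sketch and not something the export above does.

```python
def clean_shareholding(name, share):
    """Tidy one (shareholder name, percentage string) pair."""
    name = name.strip()
    if name == "Andre":  # Norwegian for "Other"
        name = "Other"
    share = share.strip().rstrip("%").strip()
    # Empty percentage cells become None instead of raising an error.
    return name, (float(share) if share else None)


print(clean_shareholding(" Andre ", "12.50% "))
print(clean_shareholding("Shell Exploration And Production Holdings", "100.00%"))
```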

In order to get the OpenCorporates identifiers, we should be able to just run company names through the OpenCorporates reconciliation service.

Hmmm.. I wonder – do we even have to go that far? From the Norwegian company number, is the OpenCorporates identifier just that number in the Norwegian namespace? So for BAYERNGAS NORGE AS, which has Norwegian company number 989490168, can we look it up directly on OpenCorporates as http://opencorporates.com/companies/no/989490168? It seems like we can…
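If that pattern holds, building the OpenCorporates URL from a Norwegian company number is a one-liner. A minimal sketch, assuming the URL form shown above keeps working; the function name is made up, and `no` is the jurisdiction code taken from the example URL.

```python
def opencorporates_url(company_number, jurisdiction="no"):
    """Build an OpenCorporates company URL from a jurisdiction code and company number."""
    return "http://opencorporates.com/companies/{}/{}".format(jurisdiction, company_number)


print(opencorporates_url(989490168))
```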

This means we possibly have an alternative to step 2 – rather than picking up company numbers by searching into and scraping the Norwegian company register, we can reconcile names against the OpenCorporates reconciliation API and then pick up the company numbers from there?

Spending & Receipts Transparency as a Consequence of Accepting Public Money?

One of the things I’ve been pondering lately is the asymmetry that exists between the information disclosures that public bodies are obliged to make compared to private ones. My gut feeling is that the public bodies may be placed at a disadvantage by these obligations compared to the private companies, though I guess I need to find some specific examples of this. (Cost may be one; having to release data that can be used by competitors in a procurement exercise may be another; if you have any good examples, please post them in the comments…)

To start with, let’s see how the field is currently set in the area of “transparency” (at least, in a spending sense).

Central government spend over £25k
The obligation on NHS bodies to publish spending data on transactions over £25,000 appears to come via HM Treasury reporting guidance on Transparency – Publication of spend over £25,000 (9th September, 2010), which was released in support of a Prime Ministerial letter of 31st May 2010 to Secretaries of State:

2.1 Scope
2.1.1 This guidance applies to all parts of central government as defined by the Office for National Statistics, including departments, non‐ministerial departments, agencies, NDPBs, Trading Funds and NHS bodies. There are a limited number of exceptions to the requirement to publish. The Intelligence Agencies are completely exempt from this requirement. The following are also not subject to this requirement:
• Financial and non‐financial public corporations
• Parliamentary bodies
• Devolved Administrations
2.1.2 However it is recommended that these bodies adopt this guidance as best practice. Separate guidance is being prepared for local authorities.
2.1.3 Where an organisation comprises both a central government body and a public corporation (e.g. the BBC), this requirement applies to the part of the organisation that is classed as part of central government. The requirement does not apply to that part of the organisation that is a public corporation.

I’m not sure what power obliges these parts of government to conform to this guidance, or to the requirement to publish spending data for transactions over £25,000.

A further letter – published on 7 July 2011 – added further obligations across a range of central government departments, as well as on the NHS.

(For information on Transparency in Procurement and Contracting see this supplier factsheet.)

Local authority spend over £500
By contrast, local authorities seem to be obliged to release spending data as a result of the publication of a Code of Recommended Practice for Local Authorities on Data Transparency (29 September 2011) (via) which was “issued by the Secretary of State for Communities and Local Government in exercise of his powers under section 2 of the Local Government, Planning and Land Act 1980 to issue a Code of Recommended Practice (The Code) as to the publication of information by local authorities about the discharge of their functions and other matters which he considers to be related”, where “local authority” means:

– a county council
– a district council
– a parish council which has gross annual income or expenditure (whichever is the higher) of at least £200,000
– a London borough council
– the Common Council of the City of London in its capacity as a local authority or police authority
– the Council of the Isles of Scilly
– a National Park authority for a National Park in England
– the Broads Authority
– the Greater London Authority so far as it exercises its functions through the Mayor
– the London Fire and Emergency Planning Authority
– Transport for London
– the London Development Agency
– a fire and rescue authority (constituted by a scheme under section 2 of the Fire and Rescue Services Act 2004 or a scheme to which section 4 of that Act applies, and a metropolitan county fire and rescue authority)
– a police authority, meaning:
(a) a police authority established under section 3 of the Police Act 1996
(b) the Metropolitan Police Authority
– a joint authority established by Part IV of the Local Government Act 1985 (fire and rescue services and transport)
– joint waste authorities, i.e. an authority established for an area in England by an order under section 207 of the Local Government and Public Involvement in Health Act 2007
– an economic prosperity board established under section 88 of the Local Democracy, Economic Development and Construction Act 2009
– a combined authority established under section 103 of that Act
– waste disposal authorities, i.e. an authority established under section 10 of the Local Government Act 1985
– an Integrated Transport Authority for an integrated transport area in England

The policy area associated with releasing local spending data is this one: Policy – Making local councils more transparent and accountable to local people

UPDATE: an amendment to the Local Government, Planning and Land Act 1980 extends the code of practice relating to local government publication schemes to include “information about any expenditure incurred by authorities” and “information about any legally enforceable agreement entered into by authorities and any invitations to tender for such agreements”.

So What?

In the Observer (“Open government? Don’t make me laugh”, September 29th, 2013), columnist Nick Cohen wrote:

Public services have always moved from daylight into darkness when private managers take them over. Ever since Labour passed the Freedom of Information Act in 2000, MPs, journalists, bloggers, academics, campaign groups and concerned citizens have been able to examine a prison, say, or medical service up to the moment of privatisation when the possibility of scrutiny vanished.

Sadiq Khan, Labour’s justice spokesman, grasped the need to extend freedom of information to cover the private recipients of public money…

As it is the job of parliament to hold the executive to account, Khan set a test for G4S. He asked for details of its restraint techniques. The company replied that it could not respond to freedom of information requests. The Ministry of Justice would, even though G4S trained the guards and knew what they did while the ministry did not.

The cloak of secrecy may soon be draped over the public sector as well. The Campaign for Freedom of Information is alarmed – to put it mildly – that ministers are talking about making it all too easy for civil servants to refuse to disclose information that the public needs to know – and once had a right to know.

The keenness with which the coalition is protecting commercial interests explains a ministerial manoeuvre that baffled me at the time. When libel reform came before parliament, I, along with everyone else, assumed that the private contractors moving into the NHS, prison service and just about every other service would not be allowed to sue their critics for libel. Under the far-from-liberal existing law, public authorities could not sue because in a democracy voters were free to speak their minds about the providers of public services even if what they said was not in the best possible taste. Indeed, as taxpayers and as the recipients of services, we had a dual justification for saying what we wanted without the threat that crushing financial penalties would bully us into silence.

In the new market-orientated order the coalition was so keen to embrace, any restriction on robust debate would be unfair as well as undemocratic. A failure to allow free speech would mean that businesses and charities could say what they liked about a local authority bidding for a service, but the local authority could not respond in kind for fear of a writ. The Conservatives would not give an inch. In the name of libel reform, they insisted that the freedom to argue in the public square must be restricted and gave private interests an exemption from criticism they denied to public services.

Whatever your feelings about public services and the extent to which private companies could or should be able to deliver them, it seems to me that the different transparency regulations are simply not fair. Public bodies receive public money, so you may argue that it is only right that they should disclose how they spend it. But when the spend with another organisation is so large that it is essentially devolving a significant part of a public body’s budget for spending by a private company, then that private body should be transparent about the way in which it further spends or otherwise allocates the money.

At the moment, transparency in the UK tries to make it possible for us to see who public bodies give money to. If one public body, A, spends with another public body, B, we can also derive information about the receipts that body B has from other public bodies, such as A. If A spends with private company C, we know that A has spent with C but not how C further disburses the funds. If C purchases a service from public body B (which could be a police authority, for example), we don’t know that C spent the public money with B, nor do we know (from the non-existent recipient column of C’s non-existent transparency spending data) that public body B received public money from public body A via private company C.
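To make the A, B, C example concrete, here is a minimal sketch of which payments show up in published spending data and which do not. The payment amounts are made up for illustration, and the rule that only public bodies publish is a simplification of the current position rather than a statement of the regulations.

```python
# Hypothetical payments as (payer, recipient, amount in GBP).
# A and B are public bodies; C is a private company.
PUBLIC_BODIES = {"A", "B"}

PAYMENTS = [
    ("A", "B", 100_000),  # public body to public body
    ("A", "C", 50_000),   # public body to private company
    ("C", "B", 20_000),   # private company spending onward with a public body
]


def visible_payments(payments, public_bodies):
    """Payments we can see: the payer is a public body that publishes its spend."""
    return [p for p in payments if p[0] in public_bodies]


def hidden_payments(payments, public_bodies):
    """Payments we cannot see: the payer is not a publishing public body."""
    return [p for p in payments if p[0] not in public_bodies]


if __name__ == "__main__":
    print("Visible:", visible_payments(PAYMENTS, PUBLIC_BODIES))
    print("Hidden: ", hidden_payments(PAYMENTS, PUBLIC_BODIES))
```

The C to B payment is the one that drops out of view: nothing published by A or B tells us that B's income from C started out as public money from A.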

There is an asymmetry in the way we have sight of public spend when it passes through private hands.

The solution? How about we require private companies that obtain substantial receipts from public bodies (say, over £25k in a single payment) to account for their spend associated with the contract in the matter of payments over £500? If the £25k/£500 limits are too onerous (government inflicting red tape on private enterprise), then how about a £250k/£10k breakdown? As to the red tape: the public bodies have to deal with it, and they are increasingly competing for the same pot of money with the private companies. Come on, chaps, play fair…
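As a rough sketch of how such a rule might work in practice (the function names and example figures are hypothetical, and the thresholds are just the ones floated above):

```python
# Proposed thresholds from the text; both are assumptions, not existing law.
TRIGGER_GBP = 25_000  # a single receipt from a public body above this triggers the duty
REPORT_GBP = 500      # onward payments above this would then have to be published


def must_report(receipts, trigger=TRIGGER_GBP):
    """True if any single receipt from a public body exceeds the trigger."""
    return any(amount > trigger for amount in receipts)


def reportable_payments(receipts, payments, trigger=TRIGGER_GBP, floor=REPORT_GBP):
    """Onward payments (recipient, amount) the company would have to publish."""
    if not must_report(receipts, trigger):
        return []
    return [(recipient, amount) for recipient, amount in payments if amount > floor]


if __name__ == "__main__":
    # Hypothetical company: one £30k receipt, two onward payments.
    onward = [("Public body B", 20_000), ("Stationery Ltd", 120)]
    print(reportable_payments([30_000], onward))
    # The looser £250k/£10k variant is just a change of parameters.
    print(reportable_payments([30_000], onward, trigger=250_000, floor=10_000))
```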

The Other Problem – Centralised Receipts

As well as the loss of oversight into how public money is being spent, there is a secondary problem when it comes to tracking the extent to which bodies both public and private receive public money: how do we find how much public money in total a particular private company has received?

On the one hand, we can easily generate reports that show the total amount of public money spent by public body A with private company C by looking at their spending data. (This may be complicated by the fact that different companies within a larger group may all receive separate payments from the body (eg separate recipients C Services Ltd and C Operations Ltd may both be part of C Ltd), but as we get more information about beneficial corporate interests we can start to piece together that part of the jigsaw.)

On the other, to see all public money receipts by a particular company, we need to collate spend data from every public body and then aggregate all the spend with a particular company to get an idea of how much public money it has received, and in what spending areas, from the public sector. (For an early example of this, see Sketching Substantial Council Spending Flows to Serco Using OpenlyLocal Aggregated Spending Data.)
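The collate-then-aggregate step might look something like the following minimal sketch. The records, the body names and the supplier-to-group mapping are all invented for illustration; in practice the hard part is getting clean, consistently named supplier data from each body in the first place.

```python
from collections import defaultdict

# Hypothetical spending records collated from several public bodies:
# (public body, supplier name as published, amount in GBP).
PAYMENTS = [
    ("Council A", "C Services Ltd", 30000.0),
    ("Council A", "C Operations Ltd", 12000.0),
    ("Department D", "C Services Ltd", 250000.0),
    ("Council A", "Other Co Ltd", 800.0),
]

# Hypothetical mapping from supplier name to parent group, of the sort we
# might build from information about beneficial corporate interests.
PARENT_GROUP = {
    "C Services Ltd": "C Ltd",
    "C Operations Ltd": "C Ltd",
}


def total_receipts_by_group(payments, parent_group):
    """Total public money received by each corporate group, across all bodies."""
    totals = defaultdict(float)
    for _body, supplier, amount in payments:
        # Suppliers with no known parent are treated as their own group.
        group = parent_group.get(supplier, supplier)
        totals[group] += amount
    return dict(totals)


def receipts_by_body_and_group(payments, parent_group):
    """The same totals, broken down by the public body that paid."""
    totals = defaultdict(float)
    for body, supplier, amount in payments:
        group = parent_group.get(supplier, supplier)
        totals[(body, group)] += amount
    return dict(totals)


if __name__ == "__main__":
    print(total_receipts_by_group(PAYMENTS, PARENT_GROUP))
    print(receipts_by_body_and_group(PAYMENTS, PARENT_GROUP))
```

Rolling subsidiaries up to a parent group is what turns two unremarkable supplier lines into one substantial one.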

The solution? How about we require private corporate recipients to disclose all public money they have received as part of the deal associated with receiving that public money (a deal that public bodies have to accede to)? Again, we may wish to put a threshold on this, even one that has some sort of symmetry with the spending requirements: say, disclosure of £500+ receipts from local government and £25k+ receipts from government departments and other associated bodies.

PS I note a recent Open Letter to the PM from UK civil society organisations that makes a related request:

2. ENABLE PUBLIC SCRUTINY OF ALL ORGANISATIONS IN RECEIPT OF PUBLIC MONEY,

by opening up public sector contracts and extending transparency standards and legislation. Endorse and implement a system of ‘Open Contracting’, ensuring public disclosure and monitoring of contracting from procurement to the close of projects, and amend the Freedom of Information Act so that all information held by a contractor in connection with a public service contract is brought within its scope.

Ah, yes… Information asymmetry around FOI requests. Do you know of any companies who operate parking meters on behalf of a local council? I’d like to see if I could get this sort of data from them…

PPS see also the House of Commons Public Administration Committee, who took evidence on Statistics and open data on October 8th, 2013.