OUseful.Info, the blog…

Trying to find useful things to do with emerging technologies in open education

Archive for the ‘Policy’ Category

Random Thoughts on Convergence, Plurality, Media and Education

with one comment

[A post from my "drafts" queue...] …aka scribbled notes, and a bit of a sleep-deprived brain dump…

[Context: this post is a set of fragmentary thoughts/clippings built up from concerns I still need to clearly articulate, rooted in my feeling (i.e. I need to properly "evidence" this...) that Pearson has significant control over a large part of the school education market in terms of assessment provision, curriculum interpretation/scheme of work development and material/textbook supply; and that it is aggressively pursuing a strategy to dominate parts of the education market at global scale. I also believe that we don't fully understand how the web is shaping and will continue to shape the way citizens (age six and up!) are informed about, educated about, learn about and make sense of the world, or the extent to which we might reimagine the distinctions between news, education and policy research and development (eg From Academic Privilege to Consultations as Peer Review). If you want a more concrete scenario to contextualise this post further, imagine News International buying Pearson, or vice versa. When things like "editorial control" are mentioned, also bear in mind things like the use of third party platforms for the delivery of informal educational content and training material (what happens if Pearson buys Coursera? When will we see the first course to be taken down from a MOOC platform because the platform provider disagrees with it, especially if it is posted by an NGO, for example (eg Red Cross launches a MOOC)?).]

A recent-ish post (I’ve been away from my feeds for a bit…) on the LSE Media Policy Project blog – Committee on Convergence Kicks Off with Big Policy Questions – frames a new House of Lords Select Committee on Communications investigation in terms of what exactly is meant by convergence alongside expectations about media regulation:

[A]ny inquiry into the public policy impact of convergence has to develop a definition of what convergence is. For a long time we have been discussing convergence of networks, of companies and of end user devices. But any debate on public policy must concern itself primarily with convergence of people, behaviours and expectations: of which groups of consumers are doing what, with what, and crucially what they expect to be regulated.

Research repeatedly shows that a high proportion of consumers expect television services and screens to be regulated, but fewer have the same expectations regarding mobile or internet services. The process of media convergence in recent years has seen increasing numbers of consumers accessing newspaper and video content through their mobile phones and laptops, platforms where consumers don’t expect regulation. The next wave of converged services is likely to be driven by the rise in the number of connected TV sets, along with improved speed and reliability of broadband. This is somewhat different because it is the appearance of non-regulated services on a screen – the television – where people still have a high expectation of regulation.

Broadly, my view is that the issues remain the old ones: child protection, media plurality, content regulation and regulatory reform. But we do still need to pose the questions afresh, as the Australian Convergence Commission has done. That report [Convergence Review Final Report, where convergence is convergence "of media content and communications technologies"] acknowledges a diversification of different ways of accessing content, but notes that much content is coming from the same, or fewer sources. So we do need new thinking on how to ensure plurality, and crucially transparency of media ownership, and as David Levy also pointed out we need a fundamental rethink of the role of powerful gatekeepers in opinion formation. Also in need of a revisit is the balance between different regulatory models, such as the old debate about public funding versus market-led industry growth. Is the protection of a public sphere of regulated, responsible, ethical public media a 20th century hangover, or is it something that we want to replicate for the new converged environment?

My memory of what constituted “news” way-back-when-I-were-a-lad was the sort of thing I vaguely remember being reported in newspapers and via TV and radio news broadcasts: political stories, business and weather events, catastrophes, and so on… My everyday notion of “news” today is that what is often described as news is not so discriminatory (if, indeed, it ever was), that it typically relates to the sharing of “novelty” or current events, shared through news organisations, social sharing, and the almost direct regurgitation of corporate and lobbyist group press releases via “news” churning aggregator sites.

To the extent that convergence relates to the receipt of the same message through different (traditionally divergent) channels, notions of media ownership (which is to say, media plurality) are one particular area of concern. So for example, the fact that The Sun and The Times share a corporate parent may cause concern for a variety of reasons: that a naive consumer may think they are differently owned, and thereby “truly independent” of each other, for example. (In this case, the two products may be positioned to appeal either to independent segments, or to a segment aligned with the News International parental world view.)

In a similar sense, we might also be wary of convergence in popular conceptions of the marketplace: the fact that Currys and PCWorld are one and the same, for example, and not independent electrical retailers that compete on price to the benefit of the consumer, is something that Currys/PCWorld try to manage through adverts that show the brands as siblings, but on the High Street (or rather, in the out-of-town retail park) it can be harder to tell that they aren’t independent (or that Heinz Pickles and Ross Pickles are independently produced). (Note there are two different models at play here: DRG and News International both own brands that serve different segments of the same market, although there may be some overlap. In news terms, The Times and The Sun are marketed to different social classes; in the electrical retail space, Currys and Dixons segment along product lines, although with some overlap (TVs, computers, communications devices). In the food space, Greencore produce products in particular facilities but often under license, as do Hügli (any others to track down? eg Goldenacres pet foods). The convergence here is not one of brand ownership, but convergence, or centralisation, of production; a bit like the use of contract newspaper printers by independently owned and branded titles. In the food space, concentration of production may cause concerns when you start to think about independence of ingredient supply, or cross-contamination across apparently different food product lines. See also: generic products vs. branded products (same product, different badge); own-brand products produced by premium brand producers [need refs of good examples; is there a directory of equivalences anywhere? So for example, known brand producers who also produce similar lines under license as supermarket own brands?].)

OfCom’s June 2012 report on Measuring media plurality summarises the goals, definition and scope of plurality as follows:

• Plurality matters because it makes an important contribution to a well-functioning democratic society through informed citizens and preventing too much influence over the political process.
• We have defined plurality as a) ensuring there is a diversity of viewpoints available and consumed across and within media enterprises and b) preventing any one media owner or voice having too much influence over public opinion and the political agenda.
• Plurality needs to be considered both within organisations (i.e. internal plurality) and between organisations (i.e. external plurality).
• In terms of scope, a review of plurality should be limited to news and current affairs but these genres should be considered across television, radio, the press and online.

Picking those points apart a little more:

… plurality contributes to a well-functioning democratic society – through the means of:
i) Informed citizens – able to access and consume a wide range of viewpoints across a variety of platforms and media owners.
ii) Preventing too much influence over the political process – exercised by any one media owner.

Based on the public policy goals highlighted above, and consistent with precedent, we have defined plurality with reference to desired outcomes of a plural market:
• Ensuring there is a diversity of viewpoints available and consumed across and within media enterprises.
• Preventing any one media owner or voice having too much influence over public opinion and the political agenda.

We note that a diversity of viewpoints can be formed within an organisation and between organisations. Both are relevant to the question of plurality; throughout this report, we refer to these as internal and external plurality respectively – which we defined in our PIT report as set out below.
• External plurality: the range and number of persons having control of media enterprises in the context of their ability to influence opinions and control the agenda.
• Internal plurality: how far the range of views expressed within media enterprises may ensure sufficient plurality, including the effects of the impartiality rules for broadcast news, the culture of newsrooms and audience expectations.

…we remain of the view that news and current affairs play the primary role in delivering the public policy goals set out earlier.
We believe news and current affairs are the most relevant forms of content for the delivery of the public policy goals.

We recommend that flexibility is required to consider at which points in the value chain editorial control is most likely to be exercised, and therefore how best to measure diversity and influence.

The existing plurality public interest consideration for media enterprises is only about “the need, in relation to every different audience in the United Kingdom or in a particular area or locality of the United Kingdom, for there to be a sufficient plurality of persons with control of the media enterprises serving that audience”. On its face, this appears to be only about the number of persons having control, and the argument was put to the Court of Appeal in litigation following the decision in the Sky/ITV merger that number was all it means. However, the Court of Appeal agreed with the Competition Commission that “plurality” in this context carries an implication of range and variety as well. We think it right for this broader idea of “plurality” to be retained in any future framework.

We … recommend that consumption metrics (in particular share, reach and multi-sourcing) form the foundation of a plurality assessment:
• Share of consumption (using single-sector measurement systems, where this is possible, and bespoke cross-media ‘share of references’) is a good proxy for measuring influence in the news media market.
• Reach (particularly cross-media, using bespoke quantitative research) and multi-sourcing (using the same) are good proxies for diversity of viewpoints consumed.

We recommend that proxies of impact (and particularly perceived ‘importance’) should play a part of a broader assessment of plurality, noting that they are imperfect because one can only measure people’s conscious articulation and not actual effects.

As access to the web matures, and individuals perhaps start to consume longer-form material not from news documentaries or feature articles, but from “educational” content, will it really be the case that “news and current affairs [as mediated by the "news media", will continue to] play the primary role in delivering the public policy goals [of shaping a well-functioning democratic society]”?

In a 2003 Lords Debate on the Communications Bill, the OU’s now Chancellor Lord Puttnam declared, “our key aim is to ensure that there is a range of competing voices available to citizens so that they are free to form their own opinions”. (How might he define the aim of “education” I wonder, especially when placed in the context of commercial entities delivering all parts of a person’s education?). Furthermore:

In a period of rapid economic, technological and ownership change, the one thing we cannot do is even begin to guess at who might or might not attempt to control this or that element of the media. What we can do, however, is refuse to contemplate any broadly unacceptable level of media concentration where each of the component parts is of significant size and reach in its own right.

What we need, therefore, is the ability to identify these concentrations as and when they occur, examine them in an analytical, fact-based way and ask whether they fit our definition of “unacceptable”. The drawback of relying on cross-media ownership rules is that they can all too easily be overtaken by changes in market circumstances, as Dr Howells acknowledged in response to a question from Andrew Lansley, the MP for South Cambridgeshire, during the Committee stage. We must also dispel the current fantasy that should unacceptable levels of ownership emerge, regulators can move swiftly to put the genie back in the bottle.

There are two ways of accomplishing what we propose, and we need both of them. One looks from the viewpoint of the consumer; the other from the viewpoint of the citizen. For the consumer, we have competition policy, and that is already built into the Bill. For the citizen, we have the public interest plurality test. Together, they represent a formidable duo, and they are both flexible and future-proof.

How does that read if we replace the media with formal education? How does that read if Coursera continues to grow, Pearson buys it, and then merges with News International?

If that scenario is too far off, consider one closer to home: the market share that Pearson has over the delivery of educational content to schools and the way it is assessed.

Written by Tony Hirst

November 13, 2012 at 2:29 pm

Posted in Policy


Twitter-Powered Lobbying?


Spotting a tweet from @louwoodley around the science policy session at #solo12:

I started pondering the extent to which we might be able to generate twitter social interest maps around the interests of MPs, or the interests of folk who follow particular MPs.

One straightforward way would be to just create social interest maps around each MP, and publish them as some sort of atlas. @tweetminster maintain a list of MPs, so it’s easy enough to pull that down and use it to drive a scraper. (By the by, I have in the past mapped out connections between MPs and political journalists and Interest Differencing: Folk Commonly Followed by Tweeting MPs of Different Parties). We can also do things like look at folk commonly followed by science policy lobbyists, such as @nesta_uk.

But what else might we be able to do?

From the TheyWorkForYou API, we can pull down lists of MPs, along with MP identifiers and IDs for their official Twitter accounts. The Committee membership information appears to be out of date, but this information is available on the UK Parliament website so it should be easy enough to generate a scraper for it on something like Scraperwiki. Munging the two together means we can get a list of Twitter IDs for MPs on any given committee. We could then use this list to generate a social interest map based on folk commonly followed by followers of N or more of the M members on a given committee. (For lobby detection purposes, there are actually 3 things we might care to look at: 1) people who follow at least N from M members of a particular committee; 2) folk commonly followed by members of the committee; 3) folk commonly followed by various subsets of the followers of the committee members, such as those who follow at least N from M).
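The "at least N from M" selection, and the "folk commonly followed" count that feeds the interest map, can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the follower lists for each committee member, and the friend lists of those followers, have already been fetched into plain dictionaries. The function names, account names and sample data are all hypothetical.

```python
from collections import Counter


def followers_of_at_least_n(member_followers, n):
    """Return the accounts that follow at least n of the committee members.

    member_followers maps each member's Twitter ID to the set of accounts
    following that member (assumed to have been fetched already).
    """
    counts = Counter()
    for followers in member_followers.values():
        counts.update(set(followers))
    return {account for account, c in counts.items() if c >= n}


def commonly_followed(accounts, friends_of, top=10):
    """Rank the accounts most often followed by the given accounts.

    friends_of maps an account to the set of accounts it follows.
    """
    counts = Counter()
    for account in accounts:
        counts.update(set(friends_of.get(account, ())))
    return counts.most_common(top)


# Hypothetical data: three committee members (M = 3) and their followers.
member_followers = {
    "mp_a": {"u1", "u2", "u3"},
    "mp_b": {"u2", "u3", "u4"},
    "mp_c": {"u3", "u5"},
}
# Hypothetical data: who those followers follow in turn.
friends_of = {
    "u2": {"mp_a", "mp_b", "lobby_x"},
    "u3": {"mp_a", "mp_b", "mp_c", "lobby_x", "lobby_y"},
}

# People who follow at least N = 2 of the M = 3 members...
core = followers_of_at_least_n(member_followers, n=2)
# ...and the accounts those people most commonly follow.
interest_map = commonly_followed(core, friends_of, top=3)
```

The same two functions cover all three cases listed above: pass the members themselves as `accounts` to get folk commonly followed by the committee, or vary `n` to pick out different subsets of followers.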

Scraperwiki is also home to a scraper of All Party Group (partial) memberships (partial because the original/scraped membership list is not necessarily complete?); we should thus be able to generate lists of tweeting MPs who are members of a particular APG, using them to do similar mapping exercises to those we might do with the committee membership lists.

Another approach might be to try to identify networks of (lobby) interest by generating a graph from donors to MPs, to try to tease out whether or not particular companies are supporting a particular MP. The ability to look up company director names on OpenCorporates means that if separate individuals are named as donors to the same MP, for example, we may be able to link the donors through companies they share directorships in? (Or does legislation around donations to MPs preclude this sort of thing happening? Can individuals make donations to MPs? Can companies? Charities? Etc? What are the limits? Does the Electoral Commission’s Party and Election Finance (PEF) Online registers service contain data on general donations to MPs? Is there a database version available anywhere of the Members’ Register of Financial Interests? Hmm, I wonder, even though the information is openly available, do the Protection of Freedoms Act tweaks to FOI mean we could actually demand a database/dataset version of this information with a reasonable chance of success?).
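As a rough illustration of the donor-linking idea, here is a minimal Python sketch. It assumes we already hold a list of (donor, MP) pairs and a lookup from person name to the companies they direct (the sort of thing that might be built from OpenCorporates lookups). The function name and all the sample names and companies are hypothetical, and matching people on name alone is a crude assumption that would throw up false positives in practice.

```python
from collections import defaultdict
from itertools import combinations


def link_donors_by_company(donations, directorships):
    """For each MP, find pairs of donors who share a company directorship.

    donations: iterable of (donor_name, mp_name) pairs.
    directorships: dict mapping a person's name to the set of companies
    they are a director of.
    Returns a dict: mp_name -> list of (donor1, donor2, shared_companies).
    """
    donors_by_mp = defaultdict(set)
    for donor, mp in donations:
        donors_by_mp[mp].add(donor)

    links = defaultdict(list)
    for mp, donors in donors_by_mp.items():
        # Compare every pair of donors to the same MP.
        for d1, d2 in combinations(sorted(donors), 2):
            shared = directorships.get(d1, set()) & directorships.get(d2, set())
            if shared:
                links[mp].append((d1, d2, shared))
    return dict(links)


# Hypothetical sample data.
donations = [
    ("Alice Smith", "MP One"),
    ("Bob Jones", "MP One"),
    ("Carol White", "MP One"),
    ("Bob Jones", "MP Two"),
]
directorships = {
    "Alice Smith": {"Acme Ltd", "Widgets plc"},
    "Bob Jones": {"Acme Ltd"},
    "Carol White": {"Other Ltd"},
}

links = link_donors_by_company(donations, directorships)
```

Each entry in `links` is an edge between two donors, labelled with the companies that connect them, so the result could be loaded straight into a graph tool for the sort of mapping described above.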

Written by Tony Hirst

November 11, 2012 at 7:07 pm

Posted in Policy


Personal Data Exploitation – Recent Reports

with one comment

A Tesco advert I’ve noticed airing again recently shows how data collected around a Tesco Clubcard can be used to prepopulate an online shopping basket on Tesco Direct using just(?) the Clubcard number:

(It’s not quite that simple of course. From a quick look at the Tesco website, you first need to register an account with Tesco.com, then I’m guessing there’s some simple name/address validation around whatever Clubcard number you enter before the data is revealed to you.)

Over July and August, BIS picked up a bit of publicity around the #midata policy initiative that seeks to encourage businesses to make consumer data and data products available back to the consumers who generate it (eg From Communications Data to #midata – with a Mobile Phone Data Example). You can get an idea about how BIS are trying to woo businesses into getting on board from the midata company briefing pack (July 2012).

Whilst the talk was all upbeat, a Jigsaw Research report for BIS on Potential consumer demand for midata was more circumspect:

Whilst consumers have nagging concerns about the security and privacy of their data online, the majority of those who choose to transact online currently put these concerns to the back of their mind. This is because most people perceive the benefits of being online to outweigh the risks; it is also due to most not fully understanding the nature of the ‘threat’ as they have limited understanding of how their data is currently collected and used by third parties as well as about what value it holds or how to protect themselves against misuse. People therefore avoid dwelling on these nagging and undetermined concerns and rely on the absence for most of any serious incident in their previous experience. In many respects, they can be described as ‘sleep walking’ in the age of data.

When initially shown an expression of the midata concept, consumers were bewildered about why this is being proposed and what difference it would make. As consumers typically define personal data as personal identity information, they struggled initially to identify what benefits the release of such data (which they already own/know) would have for them.

There is unlikely to be very much initial consumer interest in the overarching principle of companies releasing personal data for use by consumers. If anything, this news is likely to be received with suspicion until the benefits of this can be observed in practice.

Adoption of #midata services would therefore be driven by companies developing data related products and services rather than meeting a need articulated by consumers.

A new (O2 sponsored?) report from Demos – The Data Dialogue (Sept 2012) – reports on an O2 commissioned survey “looking into the public’s attitudes towards personal information”. Apparently, “[t]he Populus survey suggests that people share an increasing amount of information about themselves – and expect to share even more in the future. However, there is a crisis of confidence: the public is uncomfortable about the way personal information and behavioural data are collected by government and commercial companies. There is a danger that this loss of confidence will lead to people sharing less information and data, which would have detrimental results for individuals, companies and the economy.”

I haven’t had a chance – yet – to read the report (it’s only just come out…) but when I do I’ll probably also read it in the context of some other related reports on personal data (including rereading the Jigsaw consumer interest report around #midata). My gut feeling(?!;-) is that there isn’t really much concern (other than the sort of concern expressed when someone asks you whether you are concerned about something in a tone that suggests you should be…), rather there is a background level of disinterest and then mild confusion if forced to consider it at all…

Anyway, here are some of the other reports in the area:

Given the notion of trust is a big part of this, maybe I need to give my colleague Ray Corrigan’s recent presentation on Trust in the Digital Economy (conference page) a close reading too, along with this First Monday article by Bibi van den Berg and Simone van der Hof: What happens to my data? A novel approach to informing users of data processing practices

Hmmm… maybe I need to block a couple of days away somewhere to just get through them and try to plot out their various lobbying positions…?

PS a couple of other things caught my eye in the last day or two… via @jonhew and @martinstabe, an ICO ruling about whether database queries create new information for the purposes of FOI (answer: it depends how hard the query is to write..) and an Out-Law note on the legitimacy (or otherwise) of outsourcing the processing of sensitive personal data.

PPS Seems like the OU is a founder member of the new Centre for Research into Information, Surveillance and Privacy (CRISP). It launches at the OU in Milton Keynes on September 20th, with a panel session on “The Future of Information, Surveillance and Privacy Research” (press welcome, I believe…).

PPPS loosely related, from late last year, a news report on how Visa/Mastercard were planning to start selling on anonymised data to marketers… I’m not sure if/how this has progressed? Also loosely related: arXiv: A Theory of Pricing Private Data; OU study on Consumer Activity Data: Usages and Challenges.

Written by Tony Hirst

September 14, 2012 at 10:03 am

Posted in Policy


A Gust of Wind Blows Across HE…

with 6 comments

From various sources (@kavubob, @mweller via @peter_scott, @downes and others), I notice:

A couple more riffs on the above:

  • Pearson are playing multiple sides, offering testing for both the upstarts (eg Udacity) and the incumbents’ response (edX). They also have a major stake in school (i.e. K12) and further education content (textbooks, curricula) and assessment (e.g. EdExcel is a Pearson company), and they seem to be testing the waters with their own HE offerings in the form of Pearson College. Start to twitch a bit more if they start offering campus management solutions. Also look out for them bulking up their learning analytics offerings.
  • Although the OU has started offering academic wrappers around imported vendor certificates, I don’t think an equivalent course wrapper yet serves as a way of wrapping informal and semi-formal online courses, such as offerings from P2PU, Coursera, Udacity etc etc. There is at least one “officially” offered MOOC on “Learning Design”, though… (One of the models I wanted to explore with the T151 Game Design and Development 10 point short course in its final presentation was a fully open presentation with an additional for credit component based around the submission of a portfolio for credit bearing assessment. The legacy would have been a 10 point wrapper for importing informal online course activity, “proven” using an OU presented course. Maybe there’ll be a similar sort of finesse around the Learning Design MOOC? I’d certainly hope so…)
  • I note that the OU runs exams at a wide variety of examination centres (often in local colleges), so to an extent the OU already models the behaviour being adopted by edX. There are, however, a couple of notable differences: a) the OU, rather than a commercial operation such as Pearson, manages exams at local centres; b) the OU offers tutor and/or moderated forum based support to students on OU courses. Providing tutor/associate lecturer support (including face to face tutorials at local centres) to students on a 1:320 ratio or so is expensive though… I’m not sure how the costs associated with providing online moderation at a ratio of 1:100 or so scale up with increasing course sizes (eg when you factor in recruitment and briefing/training costs, as well as the costs of assessment/marking related moderation exercises etc).
  • I should probably say something about badges here, but don’t have the will to!

See also: Checking HE for Cracks.

PS re: the outsourcing of campus facilities management, I thought I’d better check… and came across this (Sussex Uni tenders for campus services) from May, 2012 (contract notice):

The University of Sussex is seeking bids to manage its estates and facilities services, which are run in-house at an annual cost of £20 million.

The move, to be completed by August next year, “is expected to bring wider market experience and expertise to the university to enable it to meet the increasing demands of a highly competitive environment”, according to a statement.

The story is still running… Unions left ‘in the dark’ over outsource plans. Companies in the ballpark – Carillion, maybe? eg they appear to have been contractors for construction works at UWE, Hertfordshire.

PPS Facilities talk reminds me of this, which relates in part to management of facilities data: Facilities and Equipment Sharing Network.

PPPS via @brlamb, Pearson ‘Education’ — Who Are These People?, which looks at some of the lobbying going on around US teacher performance assessment.

Written by Tony Hirst

September 6, 2012 at 6:15 pm

Checking HE for Cracks…

with one comment

As an HE policy blogger, apparently, (Higher education policy: 12 UK blogs worth bookmarking), I thought I’d log a handful of contextual links around Martin Weller’s playful post suggesting a conspiracy around How to dismantle a sector, stage 2 relating to some of the other possible cracks I’ve noticed recently (feel free to add more in the comments). These are offered in the spirit of conspiracy development, (cf. Umberto Eco’s Foucault’s Pendulum or The Prague Cemetery…), and provide a few more jigsaw pieces to play with…

By the by, on the conspiracies front, I notice via John Naughton the announcement of a bunch of postdoc research positions on Conspiracy and Democracy: History, Political Theory and Internet Research:

History
1. Conspiracy Theories in 19th-Century Europe
2. Cultural Transfer and Comparison: Europe and America
3. Anti-Semitic Conspiracy Theories in the Contemporary World

Political Theory
4. Rational Choice and Democratic Conspiracies
5. Ideals of Transparency and Suspicion of Democracy

The Internet
6. The impact of global networking on the nature, dissemination and impact of conspiracy theories.

What’s missing from that list (for me) is something on the network/graph structure of conspiracy theories…?

Written by Tony Hirst

September 3, 2012 at 10:06 am

The Opacity of Transparency

with 5 comments

A letter from the Prime Minister to Cabinet Ministers on July 7th, 2011 stated that:

transparency boards will be established in each of the key delivery departments (health, education, justice, work and pensions, transport).

I’ve just done a quick trawl and found:

but not corresponding boards for DfE (Education) or MoJ (Justice)? If you know where to find any more info about these boards (or links to sources explaining why they don’t exist) please let me know via the comments…

It does, however, look as if there may be a Research Sector Transparency Board on the way…(?)

There’s also a smattering of other transparency boards/panels:

(Again, please let me know via the comments if I’m missing any…)

All departments are also required to publish open data strategies – you can find links to them here: Cabinet Office list of Departmental Open Data Strategies.

I do wonder what all this alleged transparency means or makes possible though…?

Written by Tony Hirst

August 29, 2012 at 3:35 pm

Posted in opengov, Policy

Local Council Announcements via Newspapers, and Maybe Hyperlocal Blogs, Too…?


In a post on local council declarations of designated public place, I remarked on the following clause in The Local Authorities (Alcohol Consumption in Designated Public Places) Regulations 2007:

“5. Before making an order, a local authority shall cause to be published in a newspaper circulating in its area a notice— (a)identifying specifically or by description the place proposed to be identified;“

and idly wondered: how is a “newspaper circulating in its area” defined?

In Statutory Instrument 2012 No. 2089, The Local Authorities (Executive Arrangements) (Meetings and Access to Information) (England) Regulations 2012, which comes into force on September 10th, 2012, there is the following note on interpretations to be used within the regulations:

“newspaper” includes—
(a) a news agency which systematically carries on the business of selling and supplying reports or information to the newspapers; and
(b) any organisation which is systematically engaged in collecting news—
(i) for sound or television broadcasts;
(ii) for inclusion in programmes to be included in any programme service within the meaning of the Broadcasting Act 1990(6) other than a sound or television broadcasting service within the meaning of Part 3 or Part 1 of that Act respectively; or
(iii) for use in electronic or any other format to provide news to the public by means of the internet; [my emphasis]

So, for the purposes of those regulations, a newspaper includes any organisation which is systematically engaged in collecting news for use in electronic or any other format to provide news to the public by means of the internet. (The 2007 regs interpretation section doesn’t clarify the meaning of “newspapers”.)

This presumably means, for example, that under regulation 14(2), my local hyperlocal blog, Ventnorblog, could request as a newspaper a copy of any of the documents available for public inspection following payment to the Isle of Wight Council of “postage, copying or other necessary charge for transmission”. That assumes Ventnorblog passes the test of (a) being an organisation that (b) is systematically engaged in collecting news, and, for backwards compatibility with other regulations, can be shown to be (c) circulating in the local council area.

If this interpretation of newspapers applies more widely, (for example, if this interpretation is now applied across outstanding Local Government regulations), it also suggests that whereas councils would traditionally have had to place an advert in their local (print) newspaper, now they can do it via a hyperlocal blog?

So this might also include announcements about Licensing of public entertainments [1(4)], or adult shops [2(2)], tattoo shops [13(6)], temporary markets [37(1)], traffic orders [17(2a)], etc etc

PS Hmm, I wonder, is there a single list somewhere detailing all the legislation that requires local councils to “publish in a newspaper circulating in the area”…?

PPS via @robandale, this Local Government Information Unit initiative on Reforming Statutory Notices. It links to a survey for local councils relating to “how many notices are produced, how much they cost, how effective you believe they are etc.” to try to get a baseline on current practice.

A quick peek at the Isle of Wight Council Armchair Auditor (as run by Ventnorblog and updated from Adrian Short’s original version) gives us an idea of how much the Isle of Wight Council spends on “Advertising and Publicity” in areas such as “Traffic Management” (so presumably statutory notices relating to roadworks, road closures etc?) with the local rag, the Isle of Wight County Press.

Presumably a search on OpenlyLocal’s council spending dashboard would turn up similar spending categories for other councils? (It could be quite interesting to try exploring that… I’m not sure if data on the OpenlyLocal spending dashboard has been updated lately, though? I think OpenSpending take their data from OpenlyLocal, so that isn’t much additional help. And the DCLG opendatacommunities.org site only has budget related finance data at the moment?)

Hmm..thinks.. you could get an idea of how much external spending burden different bits of legislation impose on councils from their spending data, couldn’t you?
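As a rough sketch of what that might look like, here is one way of totalling "Advertising and Publicity" spend by service area from a spending CSV. The column names, supplier names and figures are all made up for illustration; a real council release will use its own headings.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extract of council spending data; the column names and
# figures are invented for illustration, not taken from any real release.
SAMPLE_CSV = """supplier,service_area,expense_type,amount
Example Local Paper,Traffic Management,Advertising and Publicity,420.00
Example Local Paper,Licensing,Advertising and Publicity,180.50
Example Stationery Ltd,Licensing,Office Supplies,75.25
Example Local Paper,Traffic Management,Advertising and Publicity,310.00
"""

def spend_by_service_area(csv_text, expense_type):
    """Total the amount column per service area for one expense type."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["expense_type"] == expense_type:
            totals[row["service_area"]] += float(row["amount"])
    return dict(totals)

totals = spend_by_service_area(SAMPLE_CSV, "Advertising and Publicity")
for area, total in sorted(totals.items()):
    print(f"{area}: {total:.2f}")
```

Mapping service areas back to the legislation that requires the notices would still be a manual job, of course.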

Written by Tony Hirst

August 23, 2012 at 11:26 am

So What Counts as “Communications Data”?

Picking up on a post by @nevali (Communications Data) that looks at the layered structure of internet based communications in general and a peek inside an SMTP session in particular, I idly wondered about the structure of a tweet and what, exactly, might count as the communications data part of it, as defined by the draft Communications Data Bill:

To what extent can we make a fair comparison with something like the “communications data” associated with this sort of transaction?

(89/365) One day this will be extinct

Or how about a postcard?

See also: From Communications Data to #midata – with a Mobile Phone Data Example

PS via @smithsam, and in a similar light, a consideration of the anatomy of a Facebook message

PPS Given part of the #midata focus on transaction data, I’ve also started wondering about the extent to which financial transactions count as communications, and how different payment mechanisms might change the nature of the transaction. For example, two people meeting face-to-face engaging in a cash transaction, versus a purchase made via an online form using a credit card.

PPPS inspired by the anatomy of a Facebook message, I just posted a tweet via the Twitter web interface to see what the traffic looked like. It was an HTTP post that included the following:

Request URL:https://api.twitter.com/1/statuses/update.json
Request Method:POST
Status Code:200 OK
Request Payload
include_entities=true&include_cards=1&status=%40ousefulapi+wondering+what+data+the+twitter+web+client+sends+when+i+post+a+tweet&post_authenticity_token=*****
Response Headers
HTTP/1.1 200 OK
status: 200 OK
version: HTTP/1.1
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
content-encoding: gzip
content-length: 880
content-type: application/json; charset=utf-8
date: Thu, 23 Aug 2012 09:32:08 GMT
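To see what was actually sent, the form-encoded request payload can be unpacked with Python's standard library. This is just a quick sketch that decodes the payload string captured above (the authenticity token was already masked in the capture).

```python
from urllib.parse import parse_qs

# Request payload as captured above; the authenticity token was already
# masked in the original capture.
payload = (
    "include_entities=true&include_cards=1"
    "&status=%40ousefulapi+wondering+what+data+the+twitter+web+client"
    "+sends+when+i+post+a+tweet&post_authenticity_token=*****"
)

# parse_qs decodes percent-escapes and "+" signs, returning a list per key;
# each key appears once here, so keep just the first value.
fields = {key: values[0] for key, values in parse_qs(payload).items()}

for key, value in fields.items():
    print(f"{key}: {value}")
```

Apart from a couple of display flags and the token, the only thing in the payload is the status text itself.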

It also got a response, which looks a lot like the data around a particular status update. Presumably the response to an update message is a set of data back describing that accepted status update?

{"in_reply_to_status_id_str":null, "id_str":"238569301735526400", "contributors":null, "truncated":false, "created_at":"Thu Aug 23 09:32:08 +0000 2012", "in_reply_to_user_id":64672382, "in_reply_to_user_id_str":"64672382", "in_reply_to_screen_name":"ousefulAPI", "user":{"id":7129072, "url":"http:\/\/blog.ouseful.info", "profile_use_background_image":true, "verified":false, "profile_text_color":"000000", "contributors_enabled":false, "created_at":"Thu Jun 28 11:37:39 +0000 2007", "profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1195013164\/Picture_23_normal.png", "profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/1195013164\/Picture_23_normal.png", "statuses_count":32203,"utc_offset":0, "profile_background_image_url_https":"https:\/\/si0.twimg.com\/profile_background_images\/2508031\/rss_globe.png", "profile_sidebar_border_color":"87BC44", "default_profile":false, "show_all_inline_media":false, "name":"Tony Hirst", "friends_count":742, "location":"UK","id_str":"7129072", "profile_background_tile":true, "protected":false, "profile_sidebar_fill_color":"E0FF92", "geo_enabled":false, "listed_count":423, "follow_request_sent":false, "lang":"en", "description":"OU lecturer, mashup artist; Isle of WIght resident and #f1datajunkie", "profile_background_color":"9AE4E8", "screen_name":"psychemedia", "is_translator":false, "time_zone":"London", "notifications":false, "profile_background_image_url":"http:\/\/a0.twimg.com\/profile_background_images\/2508031\/rss_globe.png", "default_profile_image":false, "profile_link_color":"0000FF", "favourites_count":377, "following":false,"followers_count":3905},"retweeted":false, "coordinates":null, "in_reply_to_status_id":null, "geo":null, "source":"web", "entities":{"user_mentions":[{"name":"OUseful", "screen_name":"ousefulAPI", "id_str":"64672382","indices":[0,11],"id":64672382}], "hashtags":[], "urls":[]},"id":238569301735526400,"place":null, "retweet_count":0, "favorited":false, "text":"@ousefulapi wondering what data the twitter web client sends when i post a tweet"}

What you’ll notice is that whilst the update as sent was just a message string, the response identifies the sender (along with biographical data, geo data (possibly), a link to a photo (possibly) and a real name). It also identifies the person to whom the tweet was sent (a Twitter convention is that tweets starting with @… are in some sense sent to @…*), and (via user_mentions) would explicitly identify any other individuals mentioned within the body of the tweet, that is, people who are mentioned as part of the content of the message. If the tweet began @foo @bar …, whilst @foo would be identified as some sort of addressee, @bar wouldn’t, although it would be identified as a user_mention**. However, we might assume that such a tweet was addressed in some sense to both @foo and @bar, whereas “@foo Will chat to @bar later” only mentions @bar as content… And “@foo @bar said that too, I think”, whilst clunky, could be interpreted as mentioning @bar as content rather than as a suggested addressee (eg in the sense of “@foo I think @bar said that too”).

* the tweet will only appear in the timeline of the person it is sent to (and if they follow you?), although it is still public. Many clients also display “user_mentions” tweets as a timeline, so if your Twitter username appears anywhere in the body of a tweet, you should see the tweet, even if you don’t follow the person who sent it.

** If the tweet starts with another character, eg “.@foo”, then @foo is no longer an addressee in the sense of in_reply_to. From a communications data point of view, what’s fair game as far as communications data goes?
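To make the addressee/mention distinction concrete, here is a small sketch that pulls the sender, the in_reply_to addressee and any other user_mentions out of a status. The dictionary is a trimmed-down copy of the response above, and the function name is mine, not anything from the Twitter API.

```python
# Trimmed-down copy of the response shown above, keeping only the
# fields discussed in the text.
status = {
    "text": "@ousefulapi wondering what data the twitter web client "
            "sends when i post a tweet",
    "user": {"screen_name": "psychemedia"},
    "in_reply_to_screen_name": "ousefulAPI",
    "entities": {
        "user_mentions": [
            {"screen_name": "ousefulAPI", "indices": [0, 11]},
        ],
    },
}

def who_is_involved(status):
    """Split the accounts named in a status into sender, addressee and other mentions."""
    addressee = status.get("in_reply_to_screen_name")
    mentioned = [m["screen_name"] for m in status["entities"]["user_mentions"]]
    # Anyone mentioned who is not the in_reply_to addressee counts as "other".
    others = [name for name in mentioned
              if addressee is None or name.lower() != addressee.lower()]
    return {
        "sender": status["user"]["screen_name"],
        "addressee": addressee,
        "other_mentions": others,
    }

parties = who_is_involved(status)
print(parties)
```

Whether "other_mentions" should be treated as recipients or as content is exactly the question the metadata can't answer for us.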

Because the update is sent via https, I don’t think you could argue the update was posted as a plaintext postcard? In the postal mail system, how does the law distinguish between messages placed inside an intercepted closed envelope and messages written on an intercepted postcard?

(Hmm – what’s the traffic associated with a Twitter DM, I wonder?)

Written by Tony Hirst

August 22, 2012 at 11:26 pm

From Communications Data to #midata – with a Mobile Phone Data Example

with 6 comments

A BIS Press Release (Next steps making midata a reality) seems to have resulted in folk tweeting today about the #midata consultation that was announced last month. If you haven’t been keeping up, #midata is the policy initiative around getting companies to make “[consumer data] that may be actionable and useful in making a decision or in the course of a specific activity” (whatever that means) available to users in a machine readable form. To try to help clarify matters, several vignettes are described in this July 2012 report – Example applications of the midata programme – which plays the role of a ‘draft for discussion’ at the September midata Strategy Board [link?]. Here’s a quick summary of some of them:

  • form filling: a personal datastore will help you pre-populate forms and provide certified evidence of things like: proof of citizenship, qualified to drive, passed certain exams and achieved certain qualifications, passed a CRB check, and so on. (Note: I’ve previously tried to argue the case for the OU starting to develop a service (OU Qualification Verification Service) around delivering verified tokens relating to the award of OU degrees, and degrees awarded by the polytechnics, as was (courtesy of the OU’s CNAA Aftercare Service), but after an initial flurry of interest, it was passed on. midata could bring it back maybe?)
  • home moving admin: change your details in a personal “mydata” data store, and let everyone pick up the changes from there. Just think what fun you could have with an attack on this;-)
  • contracts and warranties dashboard: did my crApple computer die the week before or after the guarantee ran out?
  • keeping track of the housekeeping: bank and financial statement data management and reporting tools. I thought there already was software for doing this? do we use it though? I’d rather my bank improved the tools it provided me with?
  • keeping up with the Jones’s: how does my house’s energy consumption compare with that of my neighbours?
  • which phone? Pick a tariff automatically based on your actual phone usage. From going through this recently, the problem is not with knowing how I use my phone (easy enough to find out), it’s with navigating the mobile phone sites trying to understand their offers. (And why can’t Vodafone send me an SMS to say I’m 10 minutes away from using up this month’s minutes, rather than letting me go over? The midata answer might be an agent that looks at my usage info and tells me when I’m getting close to my limit, which requires me having access to my contract details in a machine readable form, I guess?)

And here’s a BIS blog post summarising them: A midata future: 10 ways it could shape your choices.
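For the "which phone?" vignette, the sort of agent I have in mind could be very simple. The sketch below assumes I can get at two things in machine readable form: my monthly allowance and a list of call durations. Both are made-up numbers here, and the function is hypothetical.

```python
def minutes_alert(used_minutes, allowance_minutes, warn_within=10):
    """Return a warning when usage is within warn_within minutes of the allowance."""
    remaining = allowance_minutes - used_minutes
    if remaining < 0:
        return f"Over allowance by {-remaining} minutes"
    if remaining <= warn_within:
        return f"Only {remaining} minutes left this month"
    return None  # nothing to report yet

# Made-up call durations (in minutes) and contract allowance.
calls_this_month = [12, 45, 130, 3]
allowance = 200

alert = minutes_alert(sum(calls_this_month), allowance)
if alert:
    print(alert)
```

The hard part isn't the arithmetic, it's getting the contract details and usage data out of the provider in the first place.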

(The #midata policy seems based on a belief that users want better access to data so they can do things with it. I’m not convinced – why should I have to export my bank data to another service (increasing the number of services I must trust) rather than my bank providing me with useful tools directly? I guess one way this might play out is that any data that does dribble out may get built around by developers who then sell the tools back to the data providers so they can offer them directly? In this context, I guess I should read the BIS commissioned Jigsaw Research report: Potential consumer demand for midata.)

Today has also seen a minor flurry of chat around the call for evidence on the Communications Data Bill, presumably because the closing date for responses is tomorrow (draft Communications Data Bill). (Related reading: latest Annual Report of the Interception of Communications Commissioner.) Again, if you haven’t been keeping up, the draft Communications Data Bill describes communications data in the following terms:

  • Communications data is information about a communication; it can include the details of the time, duration, originator and recipient of a communication; but not the content of the communication itself
  • Communications data falls into three categories: subscriber data; use data; and traffic data.

The categories are further defined in an annex:

  • Subscriber Data – Subscriber data is information held or obtained by a provider in relation to persons to whom the service is provided by that provider. Those persons will include people who are subscribers to a communications service without necessarily using that service and persons who use a communications service without necessarily subscribing to it. Examples of subscriber information include:
    – ‘Subscriber checks’ (also known as ‘reverse look ups’) such as “who is the subscriber of phone number 012 345 6789?”, “who is the account holder of e-mail account xyz@xyz.anyisp.co.uk?” or “who is entitled to post to web space http://www.xyz.anyisp.co.uk?”;
    – Subscribers’ or account holders’ account information, including names and addresses for installation, and billing including payment method(s), details of payments;
    – information about the connection, disconnection and reconnection of services which the subscriber or account holder is allocated or has subscribed to (or may have subscribed to) including conference calling, call messaging, call waiting and call barring telecommunications services;
    – information about the provision to a subscriber or account holder of forwarding/redirection services;
    – information about apparatus used by, or made available to, the subscriber or account holder, including the manufacturer, model, serial numbers and apparatus codes.
    – information provided by a subscriber or account holder to a provider, such as demographic information or sign-up data (to the extent that information, such as a password, giving access to the content of any stored communications is not disclosed).
  • Use data – Use data is information about the use made by any person of a postal or telecommunications service. Examples of use data may include:
    – itemised telephone call records (numbers called);
    – itemised records of connections to internet services;
    – itemised timing and duration of service usage (calls and/or connections);
    – information about amounts of data downloaded and/or uploaded;
    – information about the use made of services which the user is allocated or has subscribed to (or may have subscribed to) including conference calling, call messaging, call waiting and call barring telecommunications services;
    – information about the use of forwarding/redirection services;
    – information about selection of preferential numbers or discount calls;
  • Traffic Data – Traffic data is data that is comprised in or attached to a communication for the purpose of transmitting the communication. Examples of traffic data may include:
    – information tracing the origin or destination of a communication that is in transmission;
    – information identifying the location of equipment when a communication is or has been made or received (such as the location of a mobile phone);
    – information identifying the sender and recipient (including copy recipients) of a communication from data comprised in or attached to the communication;
    – routing information identifying equipment through which a communication is or has been transmitted (for example, dynamic IP address allocation, file transfer logs and e-mail headers – to the extent that content of a communication, such as the subject line of an e-mail, is not disclosed);
    – anything, such as addresses or markings, written on the outside of a postal item (such as a letter, packet or parcel) that is in transmission;
    – online tracking of communications (including postal items and parcels).
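The e-mail headers example in the traffic data list suggests a split between "who/when" headers and content such as the subject line. As a sketch of one possible reading (my assumption, not a legal definition), here's how that split might look for a made-up message, using Python's standard email module.

```python
from email import message_from_string

# A made-up message for illustration; addresses use reserved example domains.
RAW = """\
From: alice@example.org
To: bob@example.com
Cc: carol@example.net
Date: Wed, 22 Aug 2012 13:07:00 +0100
Subject: Lunch on Friday?

Shall we meet at noon?
"""

# Assumption: one possible reading of the annex, not a legal definition.
ENVELOPE_HEADERS = ("From", "To", "Cc", "Date")

def split_message(raw):
    """Separate sender/recipient/time headers from the subject line and body."""
    msg = message_from_string(raw)
    envelope = {h: msg[h] for h in ENVELOPE_HEADERS if msg[h] is not None}
    content = {"Subject": msg["Subject"], "body": msg.get_payload()}
    return envelope, content

envelope, content = split_message(RAW)
print(envelope)
```

Where headers like Received, or a subject line that itself contains an address, would fall is less clear to me.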

    To put the communications data thing into context, here’s something you could try for yourself if you have a smartphone. Using something like the SMS to Text app (if you trust it!), grab your txt data from your phone and try charting it: SMS analysis (coming from an Android smartphone or an iPhone). And now ask yourself: what if I also mapped my location data, as collected by my phone? And will this sort of thing be available as midata, or will I have to collect it myself using a location tracking app if I want access to it? (There’s an asymmetry here: the company potentially collecting the data, or me collecting the data…)
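If you do export your texts, a first pass at the analysis needs nothing more than counting. The sketch below assumes a CSV export with timestamp, direction and number columns; that layout and the phone numbers are made up, and a real app's export will differ.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical export format and made-up numbers, for illustration only.
SAMPLE = """timestamp,direction,number
2012-08-20 08:15:00,sent,07700900001
2012-08-20 08:17:30,received,07700900001
2012-08-20 22:40:10,sent,07700900002
2012-08-21 08:05:00,received,07700900001
"""

def summarise(csv_text):
    """Count messages per correspondent and per hour of the day."""
    per_number = Counter()
    per_hour = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        when = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        per_number[row["number"]] += 1
        per_hour[when.hour] += 1
    return per_number, per_hour

per_number, per_hour = summarise(SAMPLE)
print(per_number.most_common())
print(sorted(per_hour.items()))
```

Note that none of this touches the message text: who, when and how often is enough to start building a picture.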

    It’s also worth bearing in mind that even if access to your data is locked down, access to the data of people associated with you might reveal quite a lot of information about you, including your location, as Adam Sadilek et al. describe: Finding Your Friends and Following Them to Where You Are (see also Far Out: Predicting Long-Term Human Mobility). My own tinkerings with emergent social positioning (looking at who the followers of particular twitter users also follow en masse) also suggest we can generate indicators about potential interests of a user by looking at the interests of their followers… Even if you’re careful about who your friends are, your followers might still reveal something about you you have tried not to disclose yourself (such as your birthday…). (That’s one of the problems with asymmetric trust models! Hmmm… could be interesting to start trying to model some of this… )
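The "who do the followers of X also follow en masse" idea boils down to a counting exercise. Here's a minimal sketch using a hand-made dictionary of followers and the accounts they follow; the account names and the 50% threshold are assumptions for illustration, and fetching the real lists from Twitter is left out.

```python
from collections import Counter

# Hypothetical data: for each follower of a target account, the set of
# accounts that follower also follows.
followers_friends = {
    "follower_a": {"f1news", "opendata", "target"},
    "follower_b": {"f1news", "target", "gardening"},
    "follower_c": {"f1news", "opendata", "target"},
}

def commonly_followed(followers_friends, target, min_share=0.5):
    """List accounts followed by at least min_share of the target's followers."""
    counts = Counter()
    for friends in followers_friends.values():
        counts.update(friends - {target})
    n = len(followers_friends)
    return [(acct, c / n) for acct, c in counts.most_common() if c / n >= min_share]

shared = commonly_followed(followers_friends, "target")
for account, share in shared:
    print(f"{account}: followed by {share:.0%} of followers")
```

The target never has to disclose anything for this to work, which is rather the point.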

    Both of these consultations provide a context for reflecting on the extent to which companies use data for their own processing purposes (for a recent review, see What happens to my data? A novel approach to informing users of data processing practices), the extent to which they share this data in raw and processed form with other companies or law enforcement agencies, the extent to which they may use it to underwrite value-added/data-powered services to users directly or when combined with data from other sources, the extent to which they may be willing to share it in raw or processed form back with users, and the extent to which users may then be willing (or licensed) to share that data with other providers, and/or combine it with data from other providers.

    One of the biggest risks from a “what might they learn about me” point of view – as well as some of the biggest potential benefits – comes from the reconciliation of data from multiple different sources. Mosaic theory is an idea taken from the intelligence community that captures the idea that when data from multiple sources is combined, the value of the whole view may be greater than the sum of the parts. When privacy concerns are idly raised as a reason against the release of data, it is often suspicion and fears around what a data mosaic picture might reveal that act as drivers of these concerns. (Similar fears are also used as a reason against the release of data, for example under Freedom of Information requests, in case a mosaic results in a picture that can be used against national interests: eg D.E. Pozen, The Mosaic Theory, National Security, and the Freedom of Information Act and MP Goodwin, A National Security Puzzle: Mosaic Theory and the First Amendment Right of Access in the Federal Courts).

    Note that within a particular dataset, we might also appeal to mosaic theory thinking; for example, might we learn different things when we observe individual data records as singletons, as opposed to a set of data (and the structures and patterns it contains) as a single thing: GPS Tracking and a ‘Mosaic Theory’ of Government Searches. And as a consequence, might we want to treat individual data records, and complete datasets, differently?
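As a toy illustration of the mosaic effect, consider two unrelated datasets that happen to share an identifier. The records, field names and phone number below are invented; the sketch just groups everything known about each identifier in one place.

```python
from collections import defaultdict

# Invented records from two separate sources that share a phone number.
location_pings = [
    {"phone": "07700900001", "place": "Ventnor", "hour": 8},
    {"phone": "07700900001", "place": "Newport", "hour": 13},
]
card_purchases = [
    {"phone": "07700900001", "shop": "Example Pharmacy", "hour": 13},
]

def build_mosaic(*sources, key="phone"):
    """Group the facts from every source under the shared identifier."""
    mosaic = defaultdict(list)
    for source in sources:
        for record in source:
            facts = {k: v for k, v in record.items() if k != key}
            mosaic[record[key]].append(facts)
    return dict(mosaic)

mosaic = build_mosaic(location_pings, card_purchases)
for phone, facts in mosaic.items():
    print(phone, facts)
```

Neither source alone says much, but together they place one person near a pharmacy in Newport at lunchtime.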

    PS via this ORG post – Consulympics: opportunities to have your say on tech policies – which details a whole raft of currently open ICT related consultations in the UK, I am reminded of this ICO Consultation on the draft Anonymisation code of practice along with a draft of the anonymisation code itself.

    Written by Tony Hirst

    August 22, 2012 at 1:07 pm

    Posted in Data, Paranoia, Policy, privacy

    Whither Transparency? This Week in Open Data

    with 2 comments

    I’m starting to feel as if I need to do myself a weekly round-up, or newsletter, on open data, if only to keep track of what’s happening and how it’s being represented. Today, for example, the Commons Public Accounts Committee published a report on Implementing the Transparency Agenda.

    From a data wrangling point of view, it was interesting that the committee picked up on the following point in its Conclusions and recommendations (thanks for the direct link, Hadley:-), whilst also missing the point…:

    2. The presentation of much government data is poor. The Cabinet Office recognises problems with the functionality and usability of its data.gov.uk portal. Government efforts to help users access data, as in crime maps and the schools performance website, have yielded better rates of access. But simply dumping data without appropriate interpretation can be of limited use and frustrating. Four out of five people who visit the Government website leave it immediately without accessing links to data. So there is a clear benefit to the public when government data is analysed and interpreted by third parties – whether that be, for example, by think-tanks, journalists, or those developing online products and smartphone applications. Indeed, the success of the transparency agenda depends on such broader use of public data. The Cabinet Office should ensure that:
    – the publication of data is accessible and easily understood by all; and
    – where government wants to encourage user choice, there are clear criteria to determine whether government itself should repackage information to promote public use, or whether this should be done by third parties.

    A great example of how data not quite being published consistently can cause all sorts of grief when trying to aggregate it came to my attention yesterday via @lauriej:

    It leads to a game where you can help make sense of not quite right column names used to describe open spending data… (I have to admit, I found the instructions a little hard to follow – a walked-through example with screenshots would have helped? It is, after all, largely a visual pattern matching exercise…)
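A first automated pass at the same problem might use fuzzy string matching to map messy headings onto a canonical set, leaving the leftovers for humans. This is only a sketch: the canonical names, the messy examples and the 0.6 cutoff are all my assumptions, and it uses difflib from the Python standard library rather than anything the game itself uses.

```python
from difflib import get_close_matches

# Assumed canonical column names; a real scheme would have its own list.
CANONICAL = ["supplier", "amount", "date", "department", "expense type"]

def normalise(name):
    """Lower-case, swap underscores for spaces and collapse whitespace."""
    return " ".join(name.lower().replace("_", " ").split())

def match_column(name, canonical=CANONICAL, cutoff=0.6):
    """Return the closest canonical name, or None if nothing is close enough."""
    hits = get_close_matches(normalise(name), canonical, n=1, cutoff=cutoff)
    return hits[0] if hits else None

# Made-up examples of not quite right headings.
messy = ["Supplier Name", "Amount (GBP)", "Expense_Type", "Transaction Ref"]
mapping = {name: match_column(name) for name in messy}
for name, guess in mapping.items():
    print(f"{name!r} -> {guess!r}")
```

Anything that comes back as None is a candidate for the crowd, which is presumably where a game like this earns its keep.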

    From a spend mapping perspective, this is also relevant:

    6. We are concerned that ‘commercial confidentiality’ may be used as an inappropriate reason for non-disclosure of data. If transparency is to be meaningful and comprehensive, private organisations providing public services under contract must make available all relevant public information. The Cabinet Office should set out policies and guidance for public bodies to build full information requirements into their contractual agreements, in a consistent way. Transparency on contract pricing which is often hidden behind commercial confidentiality clauses would help to drive down costs to the taxpayer.

    And from a knowing “what the hell is going on?” perspective, there was also this:

    7. Departments do not make it easy for users to understand the full range of information available to them. Public bodies have not generally provided full inventories of all of the information they hold, and which may be available for disclosure. The Cabinet Office should develop guidance for departments on information inventories, covering, for example, classes of information, formats, accuracy and availability; and it should mandate publication of the inventories, in an easily accessible way.

    The publication of government department open data strategies may go some way to improving this. I’ve also been of a mind that more accessible ways of releasing data burden reporting requirements could help clarify what “working data” is available, in what form, and the ways in which it is routinely being generated and passed between bodies. Sorting out better pathways between FOI releases of data and the then regular release of such data as opendata is also something I keep wittering on about (eg FOI Signals on Useful Open Data? and The FOI Route to Real (Fake) Open Data via WhatDoTheyKnow).

    From within the report, I also found a reiteration of this point notable:

    This Committee has previously argued that it is vital that we and the public can access data from private companies who contract to provide public services. We must be able to follow the taxpayers’ pound wherever it is spent. The way contracts are presently written does not enable us to override rules about commercial confidentiality. Data on public contracts delivered by private contractors must be available for scrutiny by Parliament and the public. Examples we have previously highlighted include the lack of transparency of financial information relating to the Private Finance Initiative and welfare to work contractors.

    …not least because data releases from companies are also being addressed on another front, midata, most notably via the recently announced BIS Midata 2012 review and consultation [consultation doc PDF]. For example, the consultation document suggests:

    1.10 The Government is not seeking to require the release of data electronically at this stage, and instead is proposing to take a power to do so. The Secretary of State would then have to make an order to give effect to the power. An order making power, if utilised, would compel suppliers of services and goods to provide to their customers, upon request, historic transaction/ consumption data in a machine readable format. The requirement would only apply to businesses that already hold this information electronically about individual consumers.
    1.11. Data would only have to be released electronically at the request of the consumer and would be restricted to an individual’s consumption and transaction data, since in our view this can be used to better understand consumers’ behaviour. It would not cover any proprietary analysis of the data, which has been done for its own purposes by the business receiving the request.

    (More powers to the Minister then…?!) I wonder how this requirement would extend rights available under the Data Protection Act, and why that act couldn’t be extended instead? For example, Data Protection Principle 6 includes “a right of access to a copy of the information comprised in their personal data” – couldn’t that be extended to include transaction data, suitably defined? (Though I note paragraph 1.20: “There are a number of different enforcement bodies that might be involved in enforcing midata. Data protection is enforced by the Information Commissioner’s Office (ICO), whilst the Office of Fair Trading (OFT), Trading Standards and sector regulators currently enforce consumer protection law.” and Question 17: “Which body/bodies is/are best placed to perform the enforcement role for this right?”) There are so many bits of law around relating to data that I don’t understand at all that I think I need to do myself an uncourse on them… (I also need to map out the various panels, committees and groups that have an open data interest… The latest, of course, is the Open Data User Group (ODUG), the minutes of whose first meeting were released some time ago now, although not in a directly web friendly format…)

    The consultation goes on:

    1.18. For midata to work well the data needs be made available to the consumer in electronic format as quickly as possible following a request (maybe immediately) and as inexpensively as possible. This will minimise friction and ensure that consumers are able to access meaningful data at the point it is most useful to them. This requirement will only cover data that is already held electronically at the time of the request so we expect that the time needed to respond to a consumer’s request will be short – in many cases instant

    Does the Data Protection Act require the release of data in an electronic format, and ideally a structured electronic format (i.e. as something resembling a dataset)? The recent Protection of Freedoms Act amended the FOI Act with language relating to the definition and release of datasets, so I wonder if this approach might extend elsewhere?

    Coming at the transparency thing from another direction, I also note with interest (via the BBC) that MPs say all lobbyists should be on new register:

    All lobbyists, including charities, think tanks and unions, should be subject to new lobbying regulation, a group of MPs have said. They criticised government plans to bring in a statutory register for third-party lobbyists, such as PR firms, only. They said the plan would “do nothing to improve transparency”. Instead, the MPs said, regulation should be brought in to cover all those who lobby professionally.

    This is surely a blocking move? If we can’t have a complete register, we shouldn’t have any register. So best not to have one at all for a year or two.. or three… or four… Haven’t they heard of bootstrapping and minimum viability releases?! Or maybe I got the wrong idea from the lead I took from the start of the news report? I guess I need to read what the MPs actually said in the Political and Constitutional Reform – Second Report: Introducing a statutory register of lobbyists.

    PS For a round-up of other recent reports on open data, see OpenData Reports Round Up (Links…).

    PPS This is also new to me: new UK Data Service “starting on 1 October 2012, [to] integrate the Economic and Social Data Service (ESDS), the Census Programme, the Secure Data Service and other elements of the data service infrastructure currently provided by the ESRC, including the UK Data Archive.”

    Written by Tony Hirst

    August 1, 2012 at 9:46 am

    Posted in Data, Policy
