So What Counts as “Communications Data”?

Picking up on a post by @nevali (Communications Data) that looks at the layered structure of internet-based communications in general, and takes a peek inside an SMTP session in particular, I idly wondered about the structure of a tweet and what, exactly, might count as the communications data part of it, as defined by the draft Communications Data Bill:

To what extent can we make a fair comparison with something like the “communications data” associated with this sort of transaction?

(89/365) One day this will be extinct

Or how about a postcard?

See also: From Communications Data to #midata – with a Mobile Phone Data Example

PS via @smithsam, and in a similar light, a consideration of the anatomy of a Facebook message

PPS Given part of the #midata focus on transaction data, I’ve also started wondering about the extent to which financial transactions count as communications, and how different payment mechanisms might change the nature of the transaction. For example, two people meeting face-to-face engaging in a cash transaction, versus a purchase made via an online form using a credit card.

PPPS inspired by the anatomy of a Facebook message, I just posted a tweet via the Twitter web interface to see what the traffic looked like. It was an HTTP post that included the following:

Request URL:
Request Method:POST
Status Code:200 OK
Request Payload
Response Headers
HTTP/1.1 200 OK
status: 200 OK
version: HTTP/1.1
cache-control: no-cache, no-store, must-revalidate, pre-check=0, post-check=0
content-encoding: gzip
content-length: 880
content-type: application/json; charset=utf-8
date: Thu, 23 Aug 2012 09:32:08 GMT

It also got a response, which looks a lot like the data around a particular status update. Presumably the response to an update message is a set of data back describing that accepted status update?

{"in_reply_to_status_id_str":null, "id_str":"238569301735526400", "contributors":null, "truncated":false, "created_at":"Thu Aug 23 09:32:08 +0000 2012", "in_reply_to_user_id":64672382, "in_reply_to_user_id_str":"64672382", "in_reply_to_screen_name":"ousefulAPI", "user":{"id":7129072, "url":"http:\/\/", "profile_use_background_image":true, "verified":false, "profile_text_color":"000000", "contributors_enabled":false, "created_at":"Thu Jun 28 11:37:39 +0000 2007", "profile_image_url_https":"https:\/\/\/profile_images\/1195013164\/Picture_23_normal.png", "profile_image_url":"http:\/\/\/profile_images\/1195013164\/Picture_23_normal.png", "statuses_count":32203,"utc_offset":0, "profile_background_image_url_https":"https:\/\/\/profile_background_images\/2508031\/rss_globe.png", "profile_sidebar_border_color":"87BC44", "default_profile":false, "show_all_inline_media":false, "name":"Tony Hirst", "friends_count":742, "location":"UK","id_str":"7129072", "profile_background_tile":true, "protected":false, "profile_sidebar_fill_color":"E0FF92", "geo_enabled":false, "listed_count":423, "follow_request_sent":false, "lang":"en", "description":"OU lecturer, mashup artist; Isle of WIght resident and #f1datajunkie", "profile_background_color":"9AE4E8", "screen_name":"psychemedia", "is_translator":false, "time_zone":"London", "notifications":false, "profile_background_image_url":"http:\/\/\/profile_background_images\/2508031\/rss_globe.png", "default_profile_image":false, "profile_link_color":"0000FF", "favourites_count":377, "following":false,"followers_count":3905},"retweeted":false, "coordinates":null, "in_reply_to_status_id":null, "geo":null, "source":"web", "entities":{"user_mentions":[{"name":"OUseful", "screen_name":"ousefulAPI", "id_str":"64672382","indices":[0,11],"id":64672382}], "hashtags":[], "urls":[]},"id":238569301735526400,"place":null, "retweet_count":0, "favorited":false, "text":"@ousefulapi wondering what data the twitter web client sends when i post a tweet"}

What you’ll notice is that whilst the update as sent was just a message string, the response identifies the sender (along with biographical data, possibly geo data, possibly a link to a photo, and a real name). It also identifies the person to whom the tweet was sent (a Twitter convention is that tweets starting with @… are in some sense sent to @…*), and, via user_mentions, would explicitly identify any other individuals mentioned within the body of the tweet (who are mentioned as part of the content of the message). If the tweet began @foo @bar …, then whilst @foo would be identified as some sort of addressee, @bar wouldn’t, although it would be identified as a user_mention**. However, we might assume that such a tweet was addressed in some sense to both @foo and @bar, whereas “@foo Will chat to @bar later” only mentions @bar as content… And “@foo @bar said that too, I think”, whilst clunky, could be interpreted as mentioning @bar as content rather than as a suggested addressee (eg in the sense of “@foo I think @bar said that too”).
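As a rough illustration of that split, here is a minimal Python sketch that pulls the "who, when and via what" fields out of a trimmed copy of the response and keeps them apart from the message text. The function name `split_status` and the choice of which fields go in the "envelope" are my own assumptions for illustration; they are not a reading of what the draft Bill would actually treat as communications data.

```python
import json

# Trimmed copy of the response shown above; only a handful of fields are kept.
response_text = """
{"id_str": "238569301735526400",
 "created_at": "Thu Aug 23 09:32:08 +0000 2012",
 "in_reply_to_screen_name": "ousefulAPI",
 "source": "web",
 "geo": null,
 "user": {"screen_name": "psychemedia", "name": "Tony Hirst", "location": "UK"},
 "entities": {"user_mentions": [{"screen_name": "ousefulAPI", "indices": [0, 11]}]},
 "text": "@ousefulapi wondering what data the twitter web client sends when i post a tweet"}
"""


def split_status(status):
    """Separate 'envelope' style fields from the message body.

    The grouping is an illustrative assumption, not a legal definition.
    """
    envelope = {
        "sender": status["user"]["screen_name"],
        "sender_name": status["user"]["name"],
        "sender_location": status["user"]["location"],
        "addressee": status["in_reply_to_screen_name"],
        "mentions": [m["screen_name"] for m in status["entities"]["user_mentions"]],
        "sent_at": status["created_at"],
        "client": status["source"],
        "geo": status["geo"],
    }
    content = status["text"]
    return envelope, content


envelope, content = split_status(json.loads(response_text))
for field, value in envelope.items():
    print(field, "=", value)
print("content =", content)
```

Note that the mentions list is derived from the message text, which is part of why the boundary between envelope and content is blurry here.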

* The tweet will only appear in the timeline of the person it is sent to (and if they follow you?), although it is still public. Many clients also display “user_mentions” tweets as a timeline, so if your Twitter username appears anywhere in the body of a tweet, you should see the tweet, even if you don’t follow the person who sent it.

** If the tweet starts with another character, eg “.@foo”, then @foo is no longer an addressee in the sense of in_reply_to. So what’s fair game as far as communications data goes?
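To make the addressee convention in the two notes above concrete, here is a small Python sketch that approximates it from the tweet text alone. The function `classify_handles` and its regular expression are hypothetical, a guess at the convention as described here rather than Twitter's actual parsing rules (for one thing, the simple pattern would also match the tail of an email address).

```python
import re

# Simplified pattern for a Twitter handle; an assumption for illustration only.
HANDLE = re.compile(r"@(\w+)")


def classify_handles(text):
    """Return (addressee, mentions) for a tweet body.

    Only a handle at the very start of the tweet is treated as the
    addressee; every handle, including that one, counts as a mention.
    """
    mentions = HANDLE.findall(text)
    leading = HANDLE.match(text)  # match() only succeeds at position 0
    addressee = leading.group(1) if leading else None
    return addressee, mentions


examples = [
    "@foo @bar said that too, I think",
    "@foo Will chat to @bar later",
    ".@foo this one is not addressed to anyone",
]
for tweet in examples:
    print(classify_handles(tweet), "<-", tweet)
```

In the first two examples the sketch treats @foo as the addressee and @bar as a mention only, which is exactly the ambiguity raised above: nothing in the structure tells you whether @bar was meant as a second addressee or simply as content.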

Because the update is sent via https, I don’t think you could argue the update was posted as a plaintext postcard? In the postal mail system, how does the law distinguish between messages placed inside an intercepted closed envelope and messages written on an intercepted postcard?

(Hmm – what’s the traffic associated with a Twitter DM, I wonder?)

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
