Time, Yet, for Twitter Captions on BBC iPlayer Content?

A couple of days ago, the Guardian reported a quote from Dimblebobs about Question Time being bigger than X-Factor on Twitter (How Question Time got as big as The X Factor on Twitter); so when are we going to see optional Twitter captions made available, either in real time, or on catch-up services such as iPlayer? (If you haven’t been keeping up: Twitter captions/subtitles are captions generated as an overlay for a video, based on tweets from members of a particular Twitter list, or using a particular hashtag. (In the future, it might also be worth considering the capture of tweets based on location?) Martin Hawksey has been developing several tools in this area: Twitter subtitling. His most recent demonstration – iTitle: Full circle with Twitter subtitle playback in YouTube (ALT-C 2010 Keynotes) – describes how videos of the ALT-C 2010 keynotes have recently been republished along with searchable Twitter captions.)

As Martin hinted at in What they were saying: Leaders debate on BBC iPlayer with twitter subtitles from parliamentary candidates and in the comments to that post, the volume and rate of production of tweets for a popular live event may be too great to display them all via the caption feed and still give the viewer time to read them. Which means that, for high volume backchannels, tweets need filtering or sampling (ideally in a way that avoids undue bias?) in order to limit the number (and quality?) of tweets that are actually displayed as captions. So what are the options?

First of all, we should distinguish whether we intend to work on a live feed, or an archive feed. An archive feed means that samples or filters can be in part tuned according to a post hoc analysis of all the tweets; whereas the live feed may either work in a stateless way, judging whether or not to show any individual tweet based solely on its own merits (for example, showing any particular tweet with a given, fixed probability p), or work based at least in part on the history of tweets already observed.

I think we should also distinguish between sampling tweets and filtering them. By sampling, I mean selecting each individual tweet according to probability p independently of any other information; by filtering, I mean selecting a tweet based on it or its metadata containing a particular term (for example: only selecting tweets from certain individuals, blocking tweets starting RT, and so on). Note that both sampling and filtering may feature in the selection of tweets for display, in either order (sample then filter, or filter then sample), or in more elaborate combinations (sample, filter, sample, for example).
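To make the sample/filter distinction a bit more concrete, here is a minimal Python sketch of the two stages written so that they can be chained in either order. Everything in it is an assumption made for illustration: the tweet shape (a dict with "user" and "text" keys), the function names sample, keep_if and not_retweet, and the value of p.

```python
import random
from typing import Callable, Iterable, Iterator

# Assumed tweet shape for this sketch: {"user": str, "text": str}
Tweet = dict


def sample(tweets: Iterable[Tweet], p: float, rng: random.Random) -> Iterator[Tweet]:
    """Sampling: pass each tweet with probability p, ignoring its content."""
    for tweet in tweets:
        if rng.random() < p:
            yield tweet


def keep_if(tweets: Iterable[Tweet], predicate: Callable[[Tweet], bool]) -> Iterator[Tweet]:
    """Filtering: pass a tweet only if the predicate accepts it."""
    for tweet in tweets:
        if predicate(tweet):
            yield tweet


def not_retweet(tweet: Tweet) -> bool:
    """Example filter: block tweets starting RT."""
    return not tweet["text"].startswith("RT")


tweets = [
    {"user": "a", "text": "Good point from the panel"},
    {"user": "b", "text": "RT @a: Good point from the panel"},
    {"user": "c", "text": "Not convinced by that answer"},
    {"user": "d", "text": "RT @c: Not convinced by that answer"},
]

# The same two stages, chained in either order.
filter_then_sample = list(sample(keep_if(tweets, not_retweet), 0.5, random.Random(1)))
sample_then_filter = list(keep_if(sample(tweets, 0.5, random.Random(1)), not_retweet))
```

Because both stages are generators, a longer chain (sample, filter, sample) is just another level of nesting, and the same code would work on a live stream as well as on an archived list.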

So what strategies are there..? Note that this isn’t a very principled list (been a long day!), and it is likely to be far from complete, but it’s a start, and something to mull over at least…

Sampling
– display every nth tweet;
– display the most recently received tweet in the last x seconds every y seconds;
– display any given tweet with fixed probability, p.
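As a rough illustration of the three sampling strategies listed above, here is a Python sketch. The representation of a tweet as a (timestamp in seconds, text) tuple, the function names, and the reading of the second strategy (tick every y seconds, look back x seconds) are all my assumptions rather than anything fixed.

```python
import random


def every_nth(tweets, n):
    """Display every nth tweet: the nth, 2nth, 3nth and so on."""
    return [t for i, t in enumerate(tweets, start=1) if i % n == 0]


def latest_in_lookback(tweets, x, y, end):
    """Every y seconds, display the most recent tweet from the last x seconds.

    tweets is a list of (timestamp_seconds, text) tuples; end is the
    duration of the programme in seconds.
    """
    shown = []
    tick = y
    while tick <= end:
        candidates = [t for t in tweets if tick - x < t[0] <= tick]
        if candidates:
            latest = max(candidates, key=lambda t: t[0])
            # If x > y, the same tweet can win two ticks running; show it once.
            if not shown or shown[-1] != latest:
                shown.append(latest)
        tick += y
    return shown


def with_probability(tweets, p, rng):
    """Display any given tweet with fixed probability p."""
    return [t for t in tweets if rng.random() < p]


tweets = [(1, "a"), (2, "b"), (7, "c"), (11, "d"), (12, "e")]

nth = every_nth(tweets, 2)
windowed = latest_in_lookback(tweets, x=5, y=5, end=15)
random_subset = with_probability(tweets, 0.3, random.Random(2010))
```

Note that the first and third strategies need no timing information at all, whereas the second puts a hard ceiling on the caption rate (at most one caption every y seconds), which is probably what matters most for readability.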

Historyless Filtering
– filter out retweets (items starting RT);
– filter out tweets sent to a person (tweets starting @). (Note that this does mean we limit the extent to which conversations might be displayed);
– filter tweets based on some function of the number of friends and/or followers a sender has.
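The historyless filters above only need to look at one tweet at a time, so they can be sketched as simple predicates. In the Python sketch below, the tweet fields ("text", "followers", "friends") and the particular follower function (a minimum follower count plus a minimum followers-to-friends ratio) are hypothetical choices for illustration; any other function of those counts would slot in the same way.

```python
def is_retweet(tweet):
    """Retweets: items starting RT."""
    return tweet["text"].startswith("RT")


def is_reply(tweet):
    """Tweets sent to a person: items starting @."""
    return tweet["text"].startswith("@")


def has_enough_reach(tweet, min_followers=50, min_ratio=0.5):
    """One possible function of the friend and follower counts (an assumption).

    Requires a minimum follower count and a minimum followers/friends ratio.
    """
    followers = tweet["followers"]
    friends = max(tweet["friends"], 1)  # avoid dividing by zero
    return followers >= min_followers and followers / friends >= min_ratio


def keep(tweet):
    """Stateless decision: depends only on the tweet itself."""
    return not is_retweet(tweet) and not is_reply(tweet) and has_enough_reach(tweet)


tweets = [
    {"text": "Strong answer on the economy", "followers": 400, "friends": 300},
    {"text": "RT @someone: Strong answer", "followers": 400, "friends": 300},
    {"text": "@someone I disagree", "followers": 400, "friends": 300},
    {"text": "Follow me for more", "followers": 10, "friends": 2000},
]

displayed = [t for t in tweets if keep(t)]
```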

History-based Filtering
– filter based on the number of tweets the user has already sent;
– filter based on properties of the hashtag network (for example, the number of hashtaggers following an individual). I have classed this as a history-based filter because we need some knowledge of the hashtag community, generated from a history of tweets, in order to calculate hashtag network metrics;
– filter based on the extent to which tweets are apparently part of a conversation thread (e.g. construct a conversation graph in which @a mentions @b and @b mentions @a, and select all conversations greater than a particular length). Note that we might combine this condition with other conditions, such as “where @a and @b share more than m common followers”.
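Two of the history-based filters above can be sketched fairly directly in Python: a cap on the number of tweets displayed per user, and a crude conversation detector built from mutual mentions. The tweet shape, the function names, and the definition of conversation length (the number of tweets in which one member of a mutually mentioning pair mentions the other) are assumptions for the sake of the example; the hashtag network metrics would need follower data that isn't modelled here.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def cap_per_user(tweets, max_per_user):
    """Keep at most max_per_user tweets from each sender, in arrival order."""
    seen = Counter()
    kept = []
    for tweet in tweets:
        seen[tweet["user"]] += 1
        if seen[tweet["user"]] <= max_per_user:
            kept.append(tweet)
    return kept


def mutual_mention_pairs(tweets):
    """Pairs {a, b} where a has mentioned b and b has mentioned a."""
    mentions = set()
    for tweet in tweets:
        for name in MENTION.findall(tweet["text"]):
            mentions.add((tweet["user"].lower(), name.lower()))
    return {frozenset((a, b)) for (a, b) in mentions if a != b and (b, a) in mentions}


def conversations(tweets, min_length):
    """Map each mutual pair to its tweets, keeping pairs with >= min_length tweets."""
    pairs = mutual_mention_pairs(tweets)
    by_pair = {}
    for tweet in tweets:
        author = tweet["user"].lower()
        for name in MENTION.findall(tweet["text"]):
            pair = frozenset((author, name.lower()))
            if pair in pairs:
                by_pair.setdefault(pair, []).append(tweet)
    return {pair: msgs for pair, msgs in by_pair.items() if len(msgs) >= min_length}


tweets = [
    {"user": "alice", "text": "@bob do you agree with that?"},
    {"user": "bob", "text": "@alice not really"},
    {"user": "alice", "text": "@bob why not?"},
    {"user": "carol", "text": "@alice hello"},
    {"user": "carol", "text": "great show tonight"},
    {"user": "carol", "text": "and another thing"},
]

capped = cap_per_user(tweets, 2)
threads = conversations(tweets, min_length=3)
```

As written, cap_per_user only needs the history so far, so it would work on a live feed; conversations looks at the whole list, so it is really a post hoc, archive-style analysis unless it is rerun as tweets arrive.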

Note that the filtering approach may be used either to filter out tweets and prevent them from being displayed, or to select tweets according to a particular set of criteria that means they should be displayed. In addition, filtering may be deterministic or combined with a probabilistic sampling mechanism. For example, we may choose to display a tweet with probability p where p is a function of some ranking factor with value f. An alternative approach might be to generate a score for each tweet based on one or more ranking factors as described in the filter considerations above, rank the tweets by score, and then display the one with the highest score at any given time.
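Here is a minimal Python sketch of those last two ideas: displaying a tweet with a probability p that depends on a ranking factor f, and displaying only the highest scoring tweet in each time slot. The linear mapping from f to p, the fixed-length time slots, and the "time" and "score" fields are all assumptions; how the score itself is computed is left open.

```python
import random


def display_probability(f, f_max):
    """Assumed mapping: p rises linearly with f, clamped to the range [0, 1]."""
    if f_max <= 0:
        return 0.0
    return max(0.0, min(1.0, f / f_max))


def probabilistic_select(tweets, f_max, rng):
    """Display each tweet with probability p = display_probability(score)."""
    return [t for t in tweets if rng.random() < display_probability(t["score"], f_max)]


def top_in_each_slot(tweets, slot_seconds):
    """Display the highest scoring tweet in each slot (first one wins a tie)."""
    best = {}
    for tweet in tweets:
        slot = int(tweet["time"] // slot_seconds)
        if slot not in best or tweet["score"] > best[slot]["score"]:
            best[slot] = tweet
    return [best[slot] for slot in sorted(best)]


tweets = [
    {"time": 1, "score": 2, "text": "first"},
    {"time": 3, "score": 5, "text": "second"},
    {"time": 12, "score": 1, "text": "third"},
    {"time": 14, "score": 1, "text": "fourth"},
]

maybe_shown = probabilistic_select(tweets, f_max=5, rng=random.Random(7))
ranked = top_in_each_slot(tweets, slot_seconds=10)
```

The probabilistic version still lets low scoring tweets through occasionally, which may help with the bias worry, whereas the ranked version is deterministic and always favours whoever the scoring function favours.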

The history-based approach may be used in real time, making selections based on the tweets observed (and/or maybe just the tweets displayed) so far (until-now history), or, in cases where a Twitter caption file is being generated after the fact, through analysis of the whole hashtag archive corpus (total archive). So, for example, it might be that the caption file is generated after the event for use only by catch-up viewers, with the expectation that live viewers would be able to entertain themselves directly from a live Twitter feed in their own client.

BBC iPlayer Desktop Application

So with iPlayer hitting 1 year old last week (“iPlayer Day: A year under the hood”), it’s great to see that there is now a cross-platform iPlayer downloader available (albeit as a beta) from the BBC iPlayer Labs (download the BBC iPlayer Desktop here) – a BBC news story about the release is available at BBC iPlayer now available on Mac.

According to Andrew Shorten, an Adobe Air evangelist: “The application was built using the Flex 3 framework, runs on top of AIR 1.5 and makes use of the Flash Media Rights Management Server (FMRMS) to DRM-protect content which is downloaded to the user’s desktop.”

So what?

So because it’s an Adobe Air app, it runs on Macs, Windows and Linux boxes…

If you’ve signed up for the BBC Labs experimental services:

(hmm – programme recommendations…)

…you’ll see that some iPlayer programmes now have a download link… :-)

Here are a couple more screenshots – downloading:

The “General Settings” panel allows you, among other things, to set a limit on the amount of space that can be used for storing downloaded programmes. The original limit is set quite sparingly, at 0GB… I downloaded an episode of Dr Who (or as it is known in BBC speak, Doctor Who), at 200MB (0.2GB) without issue, but trying to download another programme met with an error until more storage space was allocated. (So if iPlayer desktop refuses to download a programme, check you have enough free allocated space as far as iPlayer is concerned…)

You can also specify a download location (or opt for the sensible default). If you’re on an eee PC with a limited memory allocation, plugging in a USB memory stick and using that as the target destination seems to work fine (hat tip, Liam;-). It so

(“Allow BBC iPlayer Desktop to send usage statistics to the BBC” – that’ll tie in with the programme recommendation engine, possibly, and maybe also some future social features around the iPlayer itself? See also iPlayer 3: New Social Functions Outlined For Q1-Q2 2009. It’s worth bearing in mind that iPlayer programmes typically come with a “SHARE” link already, and some of them are even available with an embed code: Embedding BBC iPlayer Music Videos – Foals. A lot of the content on the Britain from the Air website is embeddable in your own pages – and I guess also in Google Earth, if this example is anything to go by: BBC Class Clips Video – in Google Earth.)

“Parental Guidance” controls:


And finally, downloaded content menu:

Also on the iPlayer front, it looks like next year could be an exciting year. A BBC review document published last week opened up the possibility that the iPlayer platform might be offered to other public service broadcasters, and potentially other agencies too (the OU was noticeably not mentioned anywhere in the report though…): “BBC’s iPlayer ‘could be shared’” (read the full report here: “Public service partnerships: Helping sustain UK PSB”).

Also announced last week was the possibility of an iPlayer-hosting broadband set-top box, code named “Canvas” (Count them – three IPTV Platforms, BBC, ITV and BT plan broadband Freeview service).