Searching the Backchannel – Martin Bean, OU VC, Twitter Captioned at JISC10

Other Martin’s been at it again, this time posting JISC10 Conference Keynotes with Twitter Subtitles.

The OU’s VC, Martin Bean, gave the opening keynote, and I have to admit it really did make me feel that the OU is the best place for me to be working at the moment :-)

… though maybe after embedding that, my days are numbered…? Err…

Anyway, I feel like I’ve not really been keeping up with other Martin’s efforts, so here’s a quick hack – a placemarker/waypoint in one of the directions I think the captioning could go – deep search linking into video streams (where deep linking is possible).

Rather than search across content, we’re going to filter the captions for a particular video, in this case the twitter caption file from Martin (other, other Martin?!) Bean’s #JISC10 opening keynote. The pipework is simple – grab the URL of the caption file and a “search” term, parse the captions into a feed with one item per caption, then filter on the caption content. I added a little Regular Expression block just to give a hint as to how you might generate a deeplink into the content based around the start time of the caption:

[Screenshot: the filter-based caption search pipe]

You can find the pipe here: Twitter caption search
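If you’re curious what the pipe is doing under the hood, here’s a rough Python equivalent – not Martin’s code, and purely illustrative: the caption URL, video URL and SRT-style caption format are all assumptions (iTitle may well emit a different timed-text format):

```python
import re
import urllib.request

# Hypothetical locations - swap in a real caption file and video page
CAPTION_URL = "http://example.com/jisc10-keynote.srt"
VIDEO_URL = "http://example.com/jisc10-keynote"
SEARCH_TERM = "debill"

srt = urllib.request.urlopen(CAPTION_URL).read().decode("utf-8")

# One match per caption: start time (HH:MM:SS) plus the caption text
caption = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.]\d+\s*-->[^\n]*\n(.*?)(?:\n\s*\n|\Z)",
    re.S,
)

for h, m, s, text in caption.findall(srt):
    if SEARCH_TERM.lower() in text.lower():
        secs = int(h) * 3600 + int(m) * 60 + int(s)
        # The Regular Expression block's job: turn the caption start
        # time into a YouTube-style deep link (#t=XmYs)
        print(f"{VIDEO_URL}#t={secs // 60}m{secs % 60}s  {text.strip()}")
```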

One thing to note is that it may take some time for someone to tweet what a speaker has said. If we had a transcript caption file (i.e. a timecoded transcript of the presentation) we might be able to work out the “mean time to tweet” for a particular event/twitterer, in which case we could backdate the tweet timestamps to guess the actual point in the video that a person was tweeting about. (I looked at using auto-generated transcript files from Youtube to trial this, but at the current time, they’re rubbish. That said, voice search on my phone was rubbish a year ago, but by Christmas it was working pretty well, so the Goog’s algorithms learn quickly, especially where error signals are available. So bear in mind that if you do post videos to Youtube and can upload a caption file, as well as helping viewers you’ll also be helping train Google’s auto-transcription service, because it’ll be able to compare the result of auto-transcription with your caption file…. If you’re the Goog, there are machine learning/supervised learning cribs everywhere!)
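To make the “mean time to tweet” idea a bit more concrete, here’s a toy sketch – all the numbers are invented; in practice the matched pairs would come from aligning a timecoded transcript against the tweet archive:

```python
# Toy illustration of "mean time to tweet" backdating - the matched
# pairs below are made up for illustration only.

# (seconds into video when said, seconds into video when tweeted)
matched_pairs = [(120, 155), (300, 342), (610, 648)]

mean_lag = sum(tweeted - said for said, tweeted in matched_pairs) / len(matched_pairs)

def backdate(tweet_secs):
    """Guess the point in the video a tweet was actually about."""
    return max(0, tweet_secs - mean_lag)

print(f"Mean time to tweet: {mean_lag:.0f}s")
print(f"A tweet at 700s probably refers to ~{backdate(700):.0f}s in the video")
```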

(Just by the by, I also wonder if we could colour code the captions, using a different colour to identify tweets that refer to the content of an earlier tweet/backchannel content, rather than the foreground content of the speaker?)
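For what it’s worth, a crude heuristic sketch of that colour-coding idea might look something like this – the reply/retweet test and the colour values are just assumptions for illustration:

```python
import re

def caption_colour(tweet_text):
    """Dim tweets that look like they reference earlier backchannel
    content (replies/retweets) rather than the speaker."""
    if re.match(r"(?i)^(rt\b|@\w+)", tweet_text) or "via @" in tweet_text:
        return "#999999"  # backchannel-referring: dimmed
    return "#ffffff"      # foreground commentary: default colour

for tweet in ["RT @mhawksey: nice demo", "@ajcann agreed!", "Bean: the OU at scale"]:
    print(caption_colour(tweet), "-", tweet)
```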

Unfortunately, caption files on Youtube, which does support deep time links into videos, only appear to be available to video owners (Youtube API: Captions), so I can’t do a demo with Youtube content… and I really should be doing other things, so I don’t have the time right now to look at what deeplinking elsewhere would require…:-(

PS The captioner tool can be found here: https://mashe.hawksey.info/ititle (formerly at http://www.rsc-ne-scotland.org.uk/mashe/ititle/)

Martin Hawksey, whose work this is, has described the evolution of the app in a series of posts here: http://www.rsc-ne-scotland.org.uk/mashe/?s=twitter+subtitles

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

8 thoughts on “Searching the Backchannel – Martin Bean, OU VC, Twitter Captioned at JISC10”

  1. That is v cool :-) And using it makes me think – should there be an inbuilt offset of 0.5s or so, so that you see the searched-for caption arrive? Too long and it might be confusing – not sure where the sweet spot would be…

     Btw, in your post, I love the pair programming reference, though I do feel at times as if I am having all the irresponsible fun, whilst you are lumbered with the coding?! ;-)

  2. Matching the subtitle file and video is a bit hit and miss right now. Results are pulled from Twapper Keeper and Twitter to the nearest minute, so there is already a +/- 60 sec margin of error. Having some fine tuning would be a useful feature.

     (I was just reviewing the new Martin Bean video integration and came across a nice little example of the benefit of the backchannel. If you search for ‘debill’, llordlama tweets the question which viewers can’t hear ;-) )

     [I like driving, happy for you to keep navigating ;-)]

  3. Which Martin am I? Other other other Martin? I think at least one of us should change our names.

     V good method of archiving btw.

