Trying to find useful things to do with emerging technologies in open education and data journalism. Snarky and sweary to anyone who emails to offer me content for the site.
I don’t often do posts where I just link to or re-present content that appears elsewhere on the web, but I’m going to make an exception in this case, with an extended preview of a link on Martin Hawksey’s MASHe blog…
Anyway, whilst I was watching Virtual Revolution over the weekend (and pondering the question of Broadcast Support – Thinking About Virtual Revolution) I started thinking again about replaying twitter streams alongside BBC iPlayer content, and wondering whether this could form part of a content enrichment strategy for OU/BBC co-productions.
Which leads to a how-to post on Twitter powered subtitles for BBC iPlayer, in which Martin “come[s] up with a way to allow a user to replay a downloaded iPlayer episode subtitling it with the tweets made during the original broadcast.”
This builds on my Twitter powered subtitling pattern to create a captions file for downloaded iPlayer content using the W3C Timed Text Authoring Format. A video on Martin’s post shows the twitter subtitles overlaying the iPlayer content in action.
AWESOME :-)
This is exactly why it’s worth blogging half baked ideas – because sometimes they come back better formed…
So anyway, the next step is to work out how to make full use of this… any ideas?
PS I couldn’t offhand find any iPlayer documentation about captions files, or the content packaging for stuff that gets downloaded to the iPlayer desktop – anyone got a pointer to some?
The workflow is now as follows. Suppose you have a recording of an event that people were tweeting through using a particular hashtag, and you want to annotate the recording using the tweets made at the time as subtitles.
Tweak the number of results on the page and the date setting (if necessary):
If you only want tweets FROM a particular person, limit the search that way too:
If the results you want to convert to subtitles are on “older” search results pages, navigate to the required results page
When you have a results page containing the tweets you want to convert to subtitles, grab the URL of that results page and copy it into the subtitler form at http://ouseful.open.ac.uk/twitterSubtitles.php
Optionally, if you want to specify the tweet that you want to be the first subtitle, copy its URL (that is, the URL that is pointed to by the View tweet link for that tweet):
Optionally again, if you want to specify the tweet on the results page that you want to be the last subtitle, grab its URL and paste it into the form.
Generate the subtitles:
Save the page as a text file with the suffix .sub:
You can now upload the .sub subtitle file to Youtube.
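For reference, the generated file is just plain text, one timecoded caption per line, in the start,end caption shape the subtitler emits; these example lines are illustrative rather than taken from a real tweet stream:

```text
0:00:00,0:00:12 Looking forward to this session... #example
0:00:15,0:00:27 Good point about reusing the live tweet stream #example
```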
So hopefully, that’s a little easier? (Note that there is also a bookmarklet on the subtitler page that will create the subtitle file directly from a Twitter advanced search results page.)
PPS if anyone fancies converting the Javascript that generates the subtitles in the browser to PHP that will do the processing on the server, please feel free to post the code back here as a comment ;-)
PPPS Wouldn’t it be good if CoverItLive offered an exportable subtitle file from previous events? In the meantime, does anyone know if it’s possible to get an RSS feed of posts from previous CoverItLive event commentaries?
PPPS See also: Twitterprompter?, discussing several possible use cases for Twitter in a live presentation environment.
One of the things that attracts me to serialised feeds (as well as confusing the hell out of me) is the possibility of letting people subscribe to, and add, comments in “relative time”…
… that is, as well as viewing the content via a serialised feed, the comments feed should also be serialised (with timestamps for each comment calculated relative to the time at which the person commenting started receiving the serialised feed).
Applying this to the idea of tweeted Youtube movie subtitles (Twitter Powered Subtitles for Conference Audio/Videos on Youtube), in which every tweet made at or around the time of a presentation becomes a subtitle on a recording of that presentation, it strikes me that a similar model is possible.
That is, different individuals could watch a Youtube video at different times, tweeting along as they do so, and then these tweets could be aggregated according to relative timestamps to provide a single, combined set of subtitles.
So how might this work in practice? Here’s a thought experiment run through…
Firstly, it’d probably be convenient to set up a twitter account to send the tweets to (say @example, for example).
(Alan Levine reminded me about flickr machine tags earlier today, which are maybe also worth considering in this respect, e.g. as a source of inspiration for a tagging convention?)
Grab a ctrl-C copy of the phrase @example #yt:tBmFzF8szpo for quick pasting into a new tweet, and then start watching the video, tweeting along as you do so…
To generate your subtitle feed, you can then do a search based on Tweets from your username (which would be @psychemedia in my case) to e.g. @example, with hashtag #yt:tBmFzF8szpo, and maybe also using a date range.
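Assuming the from:/to:/hashtag operators the old Twitter advanced search supported, the corresponding search feed URL might look something like this (the usernames and video ID are just the examples from above, URL-encoded):

```text
http://search.twitter.com/search.atom?q=from%3Apsychemedia+to%3Aexample+%23yt%3AtBmFzF8szpo
```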
The actual subtitle file generator could then pull in several different subtitle feeds from separate people, offset each feed’s timestamps relative to the time of its first tweet (which could maybe use a keyword, too – such as “START”: @example START #yt:tBmFzF8szpo ;-) and then produce an aggregated subtitle feed.
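As a minimal sketch of that aggregation step – the function and field names here are my own invention, not anything from an actual subtitler implementation – each feed’s tweets are offset against its own first (“START”) tweet, then pooled and sorted:

```javascript
// Offset a feed's tweets against its first tweet (the "START" marker),
// giving each tweet a time in seconds relative to when that viewer began
function relativise(feed) {
  var start = feed[0].utime;
  return feed.map(function (t) {
    return { offset: t.utime - start, text: t.text };
  });
}

// Pool several viewers' relativised feeds into one subtitle track,
// ordered by elapsed time from the start of the video
function aggregateFeeds(feeds) {
  var all = [];
  feeds.forEach(function (feed) {
    all = all.concat(relativise(feed));
  });
  return all.sort(function (a, b) { return a.offset - b.offset; });
}
```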
As more people watched the video (maybe including the subtitles to date), their feeds could be added to the aggregating subtitle file generator, and the subtitle file updated/refreshed.
Individuals could even rewatch the video and create new feeds for themselves to join in the emerging conversation…
(Okay, so it’s maybe slower than just reading through the comments, having to replay the video in real time to read the tweets, but this is a sort of thought experiment, right, albeit one that can be implemented quite easily…;-)
So I’m thinking – if live tweets from an event can be associated with a video of an event (maybe because the video is posted with a link to a (now out of date!) upcoming record for that event in order to anchor it in time) then being able to search the tweets as captions/subtitles provides a crib for deeplink searching into the video? (But then, I guess the Goog is looking at audio indexing anyway?)
PPS I just came across another tool for adding subtitles to Youtube videos, as well as videos from other online video sites – overstream.net:
It’s worth looking at, maybe?
PPPS see also Omnisio, a recent Google acquisition that offers “select clips from videos you find on YouTube and other video sites, and easily post them on your profile page or blog. Even better, you and your friends can add comments directly in the video!”.
And there’s more: “With Omnisio you make and share your own shows by assembling clips from different videos.” Roll on the remixes :-)
Chatting to @liamgh last week, I mentioned how I was stumped for an easy way to do this. He suggested creating a subtitles feed, and then uploading it to Youtube, along with the audio recording (doh!).
The trick is to upload a textfile with lines that look something like this:

0:03:14.159
Text shown at 3 min 14.159 sec for an undefined length of time.

0:02:20.250,0:02:23.8
Text shown at 2 min 20.25 sec, until 2 min 23.8 sec
Secondly – getting the list of tweets hashtagged with #carter over the period Lord Carter was speaking (i.e. the period covered by the video). For the original proof of concept, I used the tweets from the spreadsheet of scraped tweets that @benosteen grabbed for me, though it later occurred to me I could get the tweets direct from a Twitter search feed (as I’ll show in a minute).
The question now was how to get the timecode required for the subtitles file from the timestamp associated with each tweet. Note here that the timecode is the elapsed time from the start of the video. The solution I came up with was to convert the timestamps to universal time (i.e. seconds since midnight on January 1st 1970) and then find the universal time equivalent of the first tweet subtitle; subtracting this time from the universal time of all the other tweets would give the number of seconds elapsed from the first tweet, which I could convert to the timecode format.
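That calculation is easy enough to sketch in Javascript (the function name is mine, for illustration; in practice the pipe did the universal time conversion):

```javascript
// Convert a list of tweet timestamps into seconds elapsed from the
// earliest one: parse each to universal (Unix) time, then subtract
// the minimum so the first tweet sits at zero
function elapsedSeconds(timestamps) {
  var utimes = timestamps.map(function (ts) {
    return Date.parse(ts) / 1000; // seconds since 1 Jan 1970 UTC
  });
  var start = Math.min.apply(null, utimes);
  return utimes.map(function (u) { return u - start; });
}
```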
At this point, it’s probably worth pointing out that I didn’t actually need to call on @benosteen’s tweetscraper – I could just use the Twitter search API (i.e. the Twitter advanced search feed output) to grab the tweets. How so? Like this:
Looking at the results of this query, we see the timing is a little off – we actually need results from 8.30am, the actual time of the event:
Which is where this comes into play – searching for “older” results:
If you click on “Older” you’ll notice a new argument is introduced into the search results page URL – &page=:
…which means that by selecting appropriate values for rpp= and page= we can tunnel in on the results covering from a particular time by looking at “older” results pages, and grabbing the URL for the page of results covering the time period we want:
NB while I’m at it, note that there’s a corollary hack here that might come in useful somewhere, or somewhen, else – getting a Twitter search feed into a Google spreadsheet (so we can, for example, process it as a CSV file published from the spreadsheet):
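In Google Spreadsheets that’s just the importFeed function pointed at the Atom version of the search results URL – something along these lines, with the query, rpp and page values obviously depending on the results page you tunnelled in on:

```text
=importFeed("http://search.twitter.com/search.atom?q=%23carter&rpp=100&page=3")
```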
Okay – back to the main thread – and a tweak to the pipe to let us ingest the feed, rather than the spreadsheet CSV:
Just by the by, we can add a search front end to the pipe if we want:
and construct the Twitter search API URI accordingly:
(The date formatter converts the search date to the format required by the Twitter search API; it was constructed according to PHP: strftime principles.)
Ok – so let’s recap where we’re at – we’ve now got a pipe that will give us universal timecoded tweets (that’s not so far for such a long post to here, is it?!) If we take the JSON feed from the pipe into an HTML page, we can write a little handler that will produce the subtitle file from it:
Here’s the code to grab the pipe’s JSON output into an HTML file:
var pipeUrl="http://pipes.yahoo.com/pipes/pipe.run?_id=Dq_DpygL3hGV7mFEAVYZ7A&aqs";
function ousefulLoadPipe(url){
  // Request the pipe's JSON output via a dynamically injected script tag (JSONP)
  var d=document;
  var s=d.createElement('script');
  s.type='text/javascript';
  var pipeJSON=url+"&_render=json&_callback=parseJSON";
  s.src=pipeJSON;
  d.body.appendChild(s);
}
ousefulLoadPipe(pipeUrl);
Here’s the JSON handler:
function parseJSON(json_data){
  var caption; var timestamp=0; var mintime=json_data.value.items[0]['datebuilder'].utime;
  // Find the earliest tweet time, which anchors the subtitle track at 0:00:00
  for (var i=0; i<json_data.value.items.length; i++) {
    timestamp=1*json_data.value.items[i]['datebuilder'].utime;
    if (mintime>timestamp) mintime=timestamp;
  }
  // Work through the tweets oldest first (the feed arrives newest first)
  for (var j=json_data.value.items.length-1; j>=0; j--) {
    caption=json_data.value.items[j]['title'];
    timestamp=1*json_data.value.items[j]['datebuilder'].utime;
    // End each caption 3s before the next tweet, or 10s after the final one
    if (j>0) timeEnd=(1*json_data.value.items[j-1]['datebuilder'].utime)-3; else timeEnd=10+1*json_data.value.items[j]['datebuilder'].utime;
    if (timeEnd<timestamp) timeEnd=timestamp+2;
    timecode=getTimeCode(timestamp-mintime);
    timeEnd=getTimeCode(timeEnd-mintime);
    var subtitle=timecode+","+timeEnd+" "+caption+"<br/><br/>";
    document.write(subtitle);
  }
}
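The handler above calls a getTimeCode() helper that doesn’t appear in the listing; a plausible reconstruction (mine, not the original) that turns elapsed seconds into the H:MM:SS.mmm style timecodes shown earlier would be:

```javascript
// Convert elapsed seconds into an H:MM:SS.mmm style timecode string
function getTimeCode(seconds) {
  var h = Math.floor(seconds / 3600);
  var m = Math.floor((seconds % 3600) / 60);
  var s = seconds % 60;
  // Round away floating point noise, keeping millisecond precision
  var secStr = (Math.round(s * 1000) / 1000).toString();
  if (s < 10) secStr = "0" + secStr;
  var minStr = (m < 10 ? "0" : "") + m;
  return h + ":" + minStr + ":" + secStr;
}
```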
Copy and paste the output into a text file and save it with the .sub suffix, to give a file which can then be uploaded to Youtube.
So that’s the subtitle file – how about getting the audio into Youtube? I’d already grabbed an audio recording of Carter’s presentation using Audacity (wiring the “headphones out” to the “microphone in” on my laptop and playing the recording from the NESTA site), so I just clipped the first 10 minutes (I think Youtube limits videos to 10 mins?) and saved the file as a wav file, then imported it into iMovie (thinking I might want to add some images, e.g. from photos of the event on flickr). This crib – iMovie Settings for Upload to YouTube – gave me the settings I needed to export the audio/video from my old copy of iMovie to a file format I could upload to Youtube (I think more recent versions of iMovie support a “Share to Youtube” option?).
I then uploaded this file, along with the subtitles file:
So there we have it: Twitter subtitle/annotations (pulled from a Twitter search feed) to the first part of Lord Carter’s presentation at Delivering Digital Britain…
PS Also on the Twitter front, O’Reilly have started watching Twitter for links to interesting stories, or into particular debates: Twitscan: The Debate over “Open Core”.
Chatting to @cheslincoln the other night, we got into a discussion about whether or not Twitter could be used to support a meaningful discussion or conversation, given the immediacy/short lived nature of tweets and the limited character count. I argued that by linking out to posts to support claims in tweets, “hyper-discussions” were possible. So by mining the “attention trends” (a term I got from misreading a tweet of Paul Walk’s) that scaffold a conversation, might it be possible to create a summary post of a conversation, or argument, like the O’Reilly one?