Twitter Auto-translation Pipe

Popping into my Twitter feed yesterday was a reference from a hack day backchannel to a Twitter map pipe I use a lot in demos (see also Demonstrating Twitter in Conference Presentations). The tweet was tagged #brhackday, so of course I followed it, and then got stuck…

That’ll be br for Brazil then, I guess?

Anyway, driving home last night I remembered I’d messed around with a couple of language related pipes before (e.g. Filter Tweets by Language), so here’s one that does a bit of automagical translation:

We start off by reusing a couple of pipes – one to grab a twitter search feed given a user supplied search term, the other to autodetect the language using the Google Language detector API (as described in the post mentioned above).

The next step is to split the tweets based on language – if they are already in the language we want them translated to, we don’t need to do any translation… For the tweets we do need to translate, we define the language pair (fromLanguage|toLanguage). The fromLanguage is provided by the language autodetector, the toLanguage is provided by the user.
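For anyone who would rather see the split outside of Pipes, here is a minimal Python sketch of the same idea. It assumes each tweet is a dict that already carries a `lang` key filled in by the language detection step; the function name `split_by_language` and the `langpair` key are made up for illustration and are not part of the pipe itself.

```python
def split_by_language(tweets, to_lang):
    """Split tweets into those already in to_lang and those needing translation."""
    already_ok, to_translate = [], []
    for tweet in tweets:
        if tweet["lang"] == to_lang:
            # Already in the target language: pass through untouched
            already_ok.append(tweet)
        else:
            # Attach the language pair in fromLanguage|toLanguage form
            pair = f"{tweet['lang']}|{to_lang}"
            to_translate.append({**tweet, "langpair": pair})
    return already_ok, to_translate


# Sample data (hypothetical): "lang" stands in for the autodetector's output
tweets = [
    {"text": "Hello world", "lang": "en"},
    {"text": "Olá mundo", "lang": "pt"},
]
same, different = split_by_language(tweets, "en")
```

Here `same` holds the English tweet unchanged, and `different` holds the Portuguese tweet with its language pair set to `pt|en`.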

The next step is to construct a URL that will call the Google language translation API again, this time with the text that needs to be translated along with the language mapping. (It may be that the API can do a language autodetect and then automagically handle the translation – but I thought it was worth unpicking the process in case you wanted to plug in a different language translation service, for example).
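The URL building step might look something like the following in Python. This is only a sketch: the endpoint `https://example.com/translate` is a placeholder, and the parameter names `q` and `langpair` are assumptions rather than a statement of what the translation API actually expects, so swap in whatever your chosen service documents.

```python
from urllib.parse import urlencode

# Placeholder endpoint (hypothetical): replace with the real service URL
TRANSLATE_ENDPOINT = "https://example.com/translate"


def build_translate_url(text, from_lang, to_lang, endpoint=TRANSLATE_ENDPOINT):
    """Build a request URL carrying the text and the language pair."""
    params = {
        "q": text,                               # assumed parameter name
        "langpair": f"{from_lang}|{to_lang}",    # fromLanguage|toLanguage
    }
    # urlencode percent-escapes the text and the "|" separator
    return f"{endpoint}?{urlencode(params)}"


url = build_translate_url("Olá mundo", "pt", "en")
```

Keeping the endpoint as an argument is what makes it easy to plug in a different translation service later on.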

Finally, we merge the untranslated and translated streams, and sort the feed in reverse chronological order to make it a little bit more conventional:
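The merge and sort is easy to mirror in a few lines of Python. In this sketch each item is assumed to be a dict with a `published` datetime; that key name, the sample tweets and the `merge_and_sort` helper are all invented for the example.

```python
from datetime import datetime


def merge_and_sort(untranslated, translated):
    """Combine both streams and put the newest items first."""
    merged = untranslated + translated
    return sorted(merged, key=lambda item: item["published"], reverse=True)


# Sample streams (hypothetical data)
untranslated = [
    {"text": "Hello world", "published": datetime(2009, 5, 10, 9, 0)},
    {"text": "Nice hack", "published": datetime(2009, 5, 10, 11, 30)},
]
translated = [
    {"text": "Hello world (from pt)", "published": datetime(2009, 5, 10, 10, 15)},
]

feed = merge_and_sort(untranslated, translated)
```

The resulting `feed` interleaves the two streams by timestamp, newest first, rather than listing all the untranslated items ahead of the translated ones.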

So there you have it – an automagic twitter translator:

PS bah – pipe described above also needs a user input box for the twitter search term… oops!

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

5 thoughts on “Twitter Auto-translation Pipe”

  1. … and you could automatically retweet tweets in the chosen language via twitterfeed or similar. The scenario: if you are running an international event, you could use dedicated twitter account(s) to rebroadcast tagged tweets, i.e. you could even use multiple languages, with feeds for EN, FR etc. and appending individual tweets:

    EN: blah, blah blah

    FR: le’blah, le’blah, le’blah

    Very nice!

  2. Fortunately on the client I have (JournoTwit) there is an option for “Automatically translate tweets using Google Translate?” – very handy! I don’t know how many other clients have this built in though.

    1. I don’t need it built in – I’ve got a general purpose tool that will do it for me; and I built it largely myself… ;-) heh heh

  3. What if I have to detect language on the twitter firehose? You can’t make an HTTP call in such scenarios.

