Twitter Auto-translation Pipe

Popping into my Twitter feed yesterday was a reference from a hack day backchannel to a Twitter map pipe I use a lot in demos (see also Demonstrating Twitter in Conference Presentations). The tweet was tagged #brhackday, so of course I followed it, and then got stuck…

That’ll be br for Brazil then, I guess?

Anyway, driving home last night I remembered I’d messed around with a couple of language-related pipes before (e.g. Filter Tweets by Language), so here’s one that does a bit of automagical translation:

We start off by reusing a couple of pipes – one to grab a Twitter search feed given a user-supplied search term, the other to autodetect the language of each tweet using the Google Language detector API (as described in the post mentioned above).

The next step is to split the tweets based on language – if they are already in the language we want them translated to, we don’t need to do any translation… For the tweets we do need to translate, we define the language pair (fromLanguage|toLanguage). The fromLanguage is provided by the language autodetector, the toLanguage is provided by the user.
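For anyone who'd rather see that split outside of Pipes, here's a rough Python sketch. It assumes each tweet is a dict with a `lang` key already filled in by the language detector; the function name and field names are made up for illustration and aren't anything the pipe itself defines.

```python
def split_by_language(tweets, to_language):
    """Partition tweets into (already in target language, needing translation)."""
    untranslated, to_translate = [], []
    for tweet in tweets:
        if tweet["lang"] == to_language:
            # Already in the language the user asked for: pass through untouched.
            untranslated.append(tweet)
        else:
            # Build the fromLanguage|toLanguage pair for the translation step.
            pair = f'{tweet["lang"]}|{to_language}'
            to_translate.append({**tweet, "langpair": pair})
    return untranslated, to_translate


tweets = [
    {"text": "hello world", "lang": "en"},
    {"text": "bom dia", "lang": "pt"},
]
keep, todo = split_by_language(tweets, "en")
```

Here `keep` holds the English tweet as is, and `todo` holds the Portuguese one tagged with the pair `pt|en`.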

The next step is to construct a URL that will call the Google language API again, this time the translation service, passing the text that needs to be translated along with the language pair. (It may be that the API can do a language autodetect and then automagically handle the translation – but I thought it was worth unpicking the process in case you wanted to plug in a different language translation service, for example.)
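In code, building that sort of URL might look something like the sketch below. The endpoint is a placeholder and the parameter names `q` and `langpair` are assumptions on my part, so swap in whatever your translation service actually expects.

```python
from urllib.parse import urlencode

# Placeholder endpoint (hypothetical): replace with your translation service's URL.
TRANSLATE_ENDPOINT = "https://translate.example.com/translate"


def build_translate_url(text, from_language, to_language, endpoint=TRANSLATE_ENDPOINT):
    """Return a GET URL carrying the text and the fromLanguage|toLanguage pair."""
    params = {
        "q": text,  # assumed parameter name for the text to translate
        "langpair": f"{from_language}|{to_language}",  # assumed parameter name
    }
    # urlencode percent-escapes the pipe character and any spaces in the text.
    return f"{endpoint}?{urlencode(params)}"


url = build_translate_url("bom dia", "pt", "en")
```

Keeping the URL construction in its own function is what makes it easy to point the same feed at a different translation service later.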

Finally, we merge the untranslated and translated streams, and sort the feed in reverse chronological order to make it a little bit more conventional.
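The merge-and-sort step is simple enough to sketch in a few lines of Python. This assumes every item carries a `published` timestamp as a `datetime`; that field name is my own choice for the example, not something taken from the pipe.

```python
from datetime import datetime


def merge_and_sort(untranslated, translated):
    """Combine both streams and order them newest first."""
    merged = list(untranslated) + list(translated)
    return sorted(merged, key=lambda item: item["published"], reverse=True)


untranslated = [{"text": "hello world", "published": datetime(2009, 1, 1, 10, 0)}]
translated = [
    {"text": "good morning", "published": datetime(2009, 1, 1, 12, 0)},
    {"text": "good night", "published": datetime(2009, 1, 1, 9, 0)},
]
feed = merge_and_sort(untranslated, translated)
```

After this, `feed` interleaves the two streams by time rather than listing all the untranslated tweets first.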

So there you have it – an automagic twitter translator:


PS bah – pipe described above also needs a user input box for the twitter search term… oops!