Who’s Training Who?

Geek news hit the mainstream earlier this week as a ‘pootah beat a human in the ancient Chinese game of Go. The company behind the programme, London-based DeepMind Technologies, was bought by Google a couple of years ago. (By the by, another Google AI acquisition, Jetpac, call their image extraction app technology DeepBelief. Google’s DeepMind project also includes talent hires from Dark Blue Labs (natural language understanding).)

One of the problems of developing learning systems is getting the training schedule right. The principle behind many self-learning systems is that you present them with a set of inputs and the desired outputs, and then let them figure out a path from each input to its output. One way of doing this is to present the system with an input, compare the output it produces to the desired one, and then punish or reward it according to how well its output matches the desired output. Then you tell it to try again. And again. And again.
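As a rough illustration of that try, compare, adjust loop, here's a minimal sketch using a perceptron, one of the oldest and simplest learning rules. It is emphatically not how DeepMind's Go player works; the toy task (learning logical AND) and all the names (`predict`, `train`, `AND_EXAMPLES`) are made up for this example. The "punishment or reward" here is just the error signal: zero when the output matches the desired one, otherwise a nudge to the weights in the direction that would have made the answer less wrong.

```python
def predict(weights, bias, inputs):
    """Return the system's current guess (0 or 1) for one input."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(examples, epochs=20, learning_rate=1.0):
    """Repeatedly show the examples and nudge the weights after each mistake."""
    n_inputs = len(examples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):  # "try again. And again. And again."
        for inputs, desired in examples:
            output = predict(weights, bias, inputs)
            # Feedback signal: 0 if correct, +1 or -1 if wrong.
            error = desired - output
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    return weights, bias


# Toy training set (an assumption for illustration): inputs paired with
# the desired output for logical AND.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_EXAMPLES)
for inputs, desired in AND_EXAMPLES:
    print(inputs, "->", predict(weights, bias, inputs), "(wanted", desired, ")")
```

The point to notice is that the loop is only as good as its supply of (input, desired output) pairs, which is where the next bit comes in.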

Generating an open-ended supply of training sets used to be problematic, but Google has billions of human information processors, also known as human users, that it can call on to help train its machines. It does this directly, as well as via third-party services that willingly serve you up to Google as a training signal generator on your way to logging into their services. One of the tools it uses is called reCAPTCHA, originally acquired by Google along with its precursor ESP Game (which became the Google Image Labeler for a bit?) from Luis von Ahn something like a decade ago (see for example OpenLearn – Human Assisted Computing: Putting CAPTCHAs to Work and Games With a Purpose).

Anyway, here’s an example of reCAPTCHA in its current form:

[Image: reCAPTCHA demo]

You can try it out here.

(Did you see what happened there? Google won again…)

PS In other news, there seems to be some Google robot wars going on… Bloomberg: Google Puts Boston Dynamics Up for Sale in Robotics Retreat.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
