Warning — May Contain Traces of AI

A recent flurry of announcements from Google demonstrates how TensorFlow co-processors and statistical models, rather than rule-based ones, may soon be coming to a device near you.

Getting on for three years ago, Google announced they had developed a Tensor Processing Unit (TPU), a co-processor designed to accelerate machine-learning workloads. A year later (so two years ago) they announced cloud availability of TPUs, along with an “in-depth look at Google’s first Tensor Processing Unit (TPU)”.

You can check out TPU availability on Google Cloud services here. Since September 2018 (?), access to limited free TPU support has been available via Google Colab. A minimal ‘get started’ notebook can be found here.
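For reference, here’s a minimal sketch of the kind of connection check that ‘get started’ notebook runs, assuming the TF 1.x Colab runtime of the time with the TPU runtime type selected:

```python
import os
import tensorflow as tf

# In a Colab notebook with the TPU runtime selected, the address of the
# attached TPU is exposed via the COLAB_TPU_ADDR environment variable.
tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']

# Open a session against the TPU and list the devices it exposes.
with tf.Session(tpu_address) as session:
    devices = session.list_devices()

for device in devices:
    print(device.name)
```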

The next step, announced at the recent TensorFlow Developer Summit 2019 (review), is to provide TPUs you can run at home: Coral Edge TPU devices. These come in a couple of flavours:

  • Coral Dev Board, a wireless development board with 1GB of RAM and 8GB of flash memory, a micro-SD slot, gigabit Ethernet, an audio jack, HDMI connectors, a dual microphone, and an onboard CPU, GPU and “ML [machine learning] accelerator” Google Edge TPU coprocessor (seems like a Raspberry Pi on steroids?);
  • USB Accelerator, a TPU on a stick that you can plug into your Linux laptop or Raspberry Pi to give it a bit of extra oomph… (a minimal usage sketch follows below)
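For a flavour of what driving one of these looks like, here’s a minimal sketch based on the Edge TPU Python API as documented at launch; the model and image filenames are placeholders for your own files:

```python
# Sketch only: assumes the Edge TPU Python library shipped for the Coral
# boards, plus a classification model pre-compiled for the Edge TPU.
from edgetpu.classification.engine import ClassificationEngine
from PIL import Image

# Placeholder filenames for a compiled .tflite model and a test image.
engine = ClassificationEngine('mobilenet_v2_edgetpu.tflite')
image = Image.open('test_image.jpg')

# Returns (label_id, confidence) pairs for the top-scoring classes.
for label_id, score in engine.ClassifyWithImage(image, top_k=3):
    print(label_id, score)
```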

Some other things to watch out for…

Pete Warden reports how TensorFlow models may soon be coming to a microcontroller near you: Launching TensorFlow Lite for Microcontrollers (repo); ported versions for several microcontrollers are already available. It seems he gave a demo of a microcontroller responding to a particular voice activation command:

So why is this useful? First, this is running entirely locally on the embedded chip, with no need to have any internet connectivity, so it’s good to have as part of a voice interface system. The model itself takes up less than 20KB of Flash storage space, the footprint of the TensorFlow Lite code is only another 25KB of Flash, and it only needs 30KB of RAM to operate.
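To get a feel for where a sub-20KB model file might come from, here’s a rough sketch of converting a small trained Keras model into the TensorFlow Lite flatbuffer that the microcontroller runtime consumes; it uses the TF 1.x converter API (flags vary a little across versions) and the filenames are placeholders:

```python
import tensorflow as tf

# Placeholder: a small trained keyword-spotting model saved from Keras.
converter = tf.lite.TFLiteConverter.from_keras_model_file('tiny_conv.h5')

# Post-training quantisation shrinks weights to 8 bits, which is part of
# what makes such tiny on-device footprints plausible.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('tiny_conv.tflite', 'wb') as f:
    f.write(tflite_model)

# The flatbuffer is then typically embedded in the firmware as a C array.
print('Model size: %d bytes' % len(tflite_model))
```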

Lest you think this is just in the realm of demoware, Google are also releasing an all-neural on-device speech recognizer:

… a model trained using RNN [recurrent neural network] transducer (RNN-T) technology that is compact enough to reside on a phone. This means no more network latency or spottiness — the new recognizer is always available, even when you are offline. The model works at the character level, so that as you speak, it outputs words character-by-character, just as if someone was typing out what you say in real-time, and exactly as you’d expect from a keyboard dictation system.

Just reflect on that naming for a moment: Recurrent Neural Network Transducer. I normally think of transducers as physical sensors (e.g. things that continuously convert sound, or light, or pressure, or temperature to an electrical signal). Here, we have the notion of a software transducer that turns a signal into a set of meaningful symbols in a real-time conversion stream:

RNN-Ts are a form of sequence-to-sequence models that do not employ attention mechanisms. Unlike most sequence-to-sequence models, which typically need to process the entire input sequence (the waveform in our case) to produce an output (the sentence), the RNN-T continuously processes input samples and streams output symbols, a property that is welcome for speech dictation. In our implementation, the output symbols are the characters of the alphabet. The RNN-T recognizer outputs characters one-by-one, as you speak, with white spaces where appropriate. It does this with a feedback loop that feeds symbols predicted by the model back into it to predict the next symbols…
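That feedback loop is easier to see in code. The following is a schematic sketch of greedy RNN-T decoding (mine, not Google’s implementation), in which encode_frame, predict and joint are hypothetical stand-ins for the trained encoder, prediction network and joint network:

```python
BLANK = 0  # special symbol meaning "emit nothing, advance to the next frame"

def rnnt_greedy_decode(audio_frames, encode_frame, predict, joint):
    """Schematic greedy RNN-T decoding; the three callables are
    hypothetical stand-ins for trained networks."""
    output_symbols = []
    # The prediction network starts from a blank start-of-sentence symbol.
    pred_out, pred_state = predict(BLANK, state=None)
    for frame in audio_frames:
        enc_out = encode_frame(frame)  # streaming: one input sample at a time
        while True:
            # The joint network scores the next symbol from the combined
            # acoustic and symbol-history representations.
            symbol = joint(enc_out, pred_out).argmax()
            if symbol == BLANK:
                break  # nothing more to emit for this frame; move on
            output_symbols.append(symbol)  # emit a character...
            # ...and feed it back into the prediction network, closing the
            # feedback loop described in the quote above.
            pred_out, pred_state = predict(symbol, state=pred_state)
    return output_symbols
```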

We can haz all ur devices R listen 4 uz…

By the by, I note that TensorFlow Hub (about) provides a range of (partial) models to build from / retrain in your own solution. Amazon also offers pretrained ML models for SageMaker in the AWS Marketplace. At the moment, I don’t think any of these come with health warnings along the lines of may contain bias or bias inside… Which they should…
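For what it’s worth, reusing one of those partial models takes only a few lines; here’s a sketch using the TF 1.x-era hub.Module API with a published MobileNet feature-vector module (the five-class head on top is purely illustrative):

```python
import tensorflow as tf
import tensorflow_hub as hub

# A published MobileNet v2 image feature-vector module from TensorFlow Hub.
module = hub.Module(
    'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/2')

# Batches of 224x224 RGB images in, fixed-length feature vectors out.
images = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
features = module(images)

# Illustrative: train only a small new classification head on top of the
# pretrained features (a hypothetical five-class problem here).
logits = tf.layers.dense(features, 5)
```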

However, tools to help probe the various levels of feature detection embedded within a network are starting to appear. For example, Google announced a technique they’re calling activation atlases:

Activation atlases provide a new way to peer into convolutional vision networks, giving a global, hierarchical, and human-interpretable overview of concepts within the hidden layers of a network. We think of activation atlases as revealing a machine-learned alphabet for images — an array of simple, atomic concepts that are combined and recombined to form much more complex visual ideas.

An example is given of activation atlases for a convolutional image classification network:

In general, classification networks are shown an image and then asked to give that image a label from one of 1,000 predetermined classes — such as “carbonara”, “snorkel” or “frying pan”. … One neuron at one layer might respond positively to a dog’s ear, another at an earlier layer might respond to a high-contrast vertical line.

An activation atlas is built by collecting the internal activations from each of these layers of our neural network from one million images. These activations, represented by a complex set of high-dimensional vectors, [are] projected into useful 2D layouts …

[A]ll the activations are too many to consume at a glance [so] we draw a grid over the 2D layout we created. For each cell in our grid, we average all the activations that lie within the boundaries of that cell, and use feature visualization to create an iconic representation.
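That grid-and-average step is simple enough to sketch. The following is schematic (not the authors’ code): activations is an (N, D) array of high-dimensional activation vectors and xy their (N, 2) projection into the plane (e.g. via UMAP); feature visualization would then be run on each cell’s averaged vector to produce its icon:

```python
import numpy as np

def atlas_cell_averages(activations, xy, grid_size=20):
    """Schematic sketch of the activation-atlas gridding step: average the
    activation vectors whose 2D projections fall within each grid cell."""
    # Normalise the 2D layout into [0, 1) so cell indices are easy to compute.
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    cell_ids = ((xy - lo) / (hi - lo + 1e-9) * grid_size).astype(int)

    averages = {}
    for cell in {tuple(c) for c in cell_ids}:
        mask = (cell_ids[:, 0] == cell[0]) & (cell_ids[:, 1] == cell[1])
        averages[cell] = activations[mask].mean(axis=0)
    # One averaged activation vector per occupied grid cell; feature
    # visualization would turn each of these into an iconic image.
    return averages
```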

In certain respects, this reminds me a little of Andy Wuensche’s basins of attraction in discrete dynamical networks from way back when…

In that case, the idea was to represent how all possible states of a network were connected, so you could see where any given initial state might lead the network, and then to find a way to visualise that meaningfully. In this case, it seems that the idea is to try to identify what features any given node might be sensitive to (i.e. to plot all the grandmother cells (lite background)).

