Algorithmic Truthiness

With a media that failed to hold jokers to account when it had the chance, preferring “balanced” reporting that biases news reports and gives equal measure to unequally validated ideas, and social media opting for truthiness rather than fact to generate momentum for spreading (fake) news, commentators tell us we’re now in a “post-truth”/“post-factual” world.

As the OED defines it, truthiness is “the quality of seeming or being felt to be true, even if not necessarily true”.

Although the definition could be debated…


Sound familiar?

A few years ago, at the dawn of the age of Big Data, the idea of segmenting and modelling large datasets in a “theory-free” way (Big data and the end of theory?) perhaps gave an inkling that truthiness was on its way in, big time. (Compare this also with anti-expert rhetoric over the last couple of years. I’m all for slamming certain classes of academic outlook and activity, but I also think there are reasons for trusting certain sorts of claims more than others…)

The fact that data processing algorithms are likely to have ever increasing power over what we read – not only in terms of selecting which stories to show us in our personalised news feeds, but also because other machines may themselves have written the stories we’re reading – means that we need to start getting a feel for what sorts of biases are likely to be baked into these algorithms.

In contrast to earlier generations of rule based expert systems that could be asked to “explain” their reasoning, today’s systems are often black box statistical machines. Rule based expert systems reasoned, in an explainable way, towards a truth associated with the logical statements asserted into them; Deep Learning algorithms and their ilk have gut reactions and deal in truthiness.
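To make the contrast a bit more concrete, here is a toy sketch of what “explaining” might look like in a rule based system. The rules, the fact names and the `infer` function are all invented for illustration; real expert systems were rather more elaborate than this.

```python
# Hypothetical rules: (set of facts that must hold, fact to conclude).
RULES = [
    ({"has_feathers", "lays_eggs"}, "is_bird"),
    ({"is_bird", "cannot_fly", "swims"}, "is_penguin"),
]


def infer(initial_facts):
    """Forward-chain over RULES, keeping a trace of every rule that fired."""
    facts = set(initial_facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            # A rule fires when all its conditions are known facts
            # and its conclusion has not been asserted yet.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append(f"{conclusion} because {', '.join(sorted(conditions))}")
                changed = True
    return facts, trace


facts, trace = infer({"has_feathers", "lays_eggs", "cannot_fly", "swims"})
for step in trace:
    print(step)
```

The trace is the “explanation”: each conclusion can be followed back to the statements that produced it. A black box classifier would typically hand back just the label, perhaps with a score, and no such chain.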

But whereas we might be suspicious about a person making a truthy claim (“that doesn’t sound quite right to me…”), once we start to trust machines – because they appear to be right-ish, most of the time – we start to over-trust them. I think – I haven’t checked. Sounds truthy to me…

So with a tech news report doing the rounds at the moment that a “Neural Network Learns to Identify Criminals by Their Faces”, it seems that the paper authors “have demonstrated that via supervised machine learning, data-driven face classifiers are able to make reliable inference on criminality” as well as identifying “a law of normality for faces of noncriminals. After controlled for race, gender and age, the general law-biding public have facial appearances that vary in a significantly lesser degree than criminals”. (It’s not hard to imagine this being used as a ranking factor for something…) Reading off one of the charts (figure 4 in the paper), the (best) false positive rate looked to be around 6%. Are the decisions “true”, then, or just “truthy”? What level of false positivity makes the difference? (Bear in mind behaviourist training – partial reinforcement can be really powerful…) I also wonder if the researchers ran the same training schedule against IQ? Or etc etc…
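One way of getting a feel for what a 6% false positive rate might mean in practice is a back-of-the-envelope calculation using Bayes’ rule. In the sketch below, only the 6% figure comes from my reading of the chart; the true positive rate and the prevalence values are made-up assumptions, not numbers from the paper.

```python
def positive_predictive_value(prevalence, true_positive_rate, false_positive_rate):
    """Fraction of people flagged by a classifier who really are in the flagged class."""
    true_positives = prevalence * true_positive_rate
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


FALSE_POSITIVE_RATE = 0.06         # read off the chart, approximately
ASSUMED_TRUE_POSITIVE_RATE = 0.90  # assumption, for illustration only

# Assumed prevalences: a balanced test set, then two rarer cases.
for assumed_prevalence in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(
        assumed_prevalence, ASSUMED_TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE
    )
    print(f"prevalence {assumed_prevalence:.0%}: {ppv:.1%} of flagged cases are correct")
```

Under these assumptions, the rarer the thing being looked for, the more the flagged cases are dominated by false positives: at an assumed prevalence of 1 in 100, only around 13% of the people flagged would actually belong to the flagged class.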

(In passing, another recent preprint on arXiv – Lip Reading Sentences in the Wild – reports on an automated lip reading system trained on several hours of people talking on BBC television (the UK based researchers were licence fee payers, I suspect, but the Google DeepMind sponsor..?!). If you’d rather read a pop sci write up, New Scientist has one here: Google’s DeepMind AI can lip-read TV shows better than a pro. For reference, the best word error rate the researchers report is 3.3%. So are the outputs true or truthy?)

So… I’m wondering… algorithmic truthiness: the extent to which the outputs of an algorithm feel as if they could be true, even if not necessarily true. … a useful conceit, or not?

Or maybe we need an alt definition, such as “The extent to which you believe the output of an algorithm to be true rather than what you know to be true”?!


  1. Ben

    All my algorithms have high truthiness. They all seem or feel correct at the time I write them. Unfortunately my users sometimes disagree due to the quantity of bugginess.