Appropriate/ing Knowledge and Belief Tools?

A few weeks ago, I got my invite to the OpenAI GPT-3 beta (Terms of Engagement With the OpenAI API). You know the one: give it a start phrase and it will then start to generate follow-on text for you. The phrase can be anything:

It was a cold, grey, morning. I picked up the chain saw and headed off into the town.

Wasn’t long before I was joined by a couple of men I knew from the pub. We’d worked together for a few years before my accident, and they were just pleased to see me back at work.

It was enjoyable, once I’d got used to the hard physical work. I knew I was in good shape, and the cold air made me feel even better.

My arms and legs were a little stiff, but nothing that wouldn’t soon disappear. I’d stopped taking painkillers a couple of months after getting the cast off, and hadn’t had any problems since.

It had been almost three months since I’d been at work. It was good to be back.

We were working on the new school, which had been burnt down a few months before. It was almost complete, and all that was left was to do the roofing and masonry work.

The owner of the building company was a nice man, with a sense of humour. He placed a ladder against the outside wall for us to stand on while we worked on the roof. I knew it was safe as long as he was holding it.

We had almost finished, when we heard shouting from the ground. We turned around to see the boss waving his arms, and we realised that the ladder had slipped from under one of the men. He was dangling from the top rung

OpenAI machine generated text

Or from a slightly different start:

It was a cold, grey, morning. I mopped up the blood, picked up the chain saw, and headed off into the town.

My face and head were covered in blood, so I had to drive with the window down. The blood was beginning to dry and flake. I didn’t want anyone to see me, so I didn’t stop.

I pulled into the parking lot of a 7-Eleven. I went inside, pulled out a newspaper, and sat down at a table by the window.

“Joe,” the guy behind the counter called out.

“Damn…that’s me,” I thought. I looked up, and was relieved it was just him. I gave him a nod.

“Hey, Joe, you okay?”

There was no point in lying to him.

“No. Not really,” I said.

“Uh, what happened?”

I’d been here before. If I had a different face, that would have been the end of it. But I had blood on my face.

“You need to go to the hospital?”

He was getting nervous.

I shook my head and said, “No. It’s nothing. Just a fight with my wife.”

He smiled, relieved. He went back behind the counter. I continued reading the paper. A few minutes later, he came back over and stood by my table

OpenAI machine generated text

(Hmmm… product placement opportunities to help pay for these services and keep them “free”?!)
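For reference, completions like these are generated from a start phrase in much that way. A minimal sketch of what such a call looked like via the beta Python client is below; the engine name and sampling settings are illustrative placeholders, not the exact settings I used:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Ask the model to continue a start phrase; the settings here are illustrative
response = openai.Completion.create(
    engine="davinci",   # one of the beta engines
    prompt="It was a cold, grey, morning. I picked up the chain saw and headed off into the town.",
    max_tokens=200,     # cap on how much follow-on text to generate
    temperature=0.7,    # higher values give more freewheeling continuations
)

print(response.choices[0].text)
```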

Now, the machine generated text above may or may not be “plagiarised” text, lifted directly from a single source or recombined from multiple trash fiction sources. The second piece is perhaps interesting because it makes use of dialogue. I can’t help but wonder what the surrealists would have made of this, how they would have seen it as a mechanical riff on both the exquisite corpse and automatic writing: an automated “fetch the next word” algorithm built on top of a machine-based hallucination engine wrapping the GPT-3 model, itself constructed from an almost unimaginable, exquisite even, text corpus. (Yes, it was me that wrote that; not the machine.)

So the attribution of the previous text may or may not be suspect. The text may or may not be original. We have no way of knowing.

But consider the following example, a declarative rephrasing of a simple recall question — what are Asimov’s Three Laws of Robotics? — that we have asked year on year in an introductory module:

Asimov’s Three Laws of Robotics are:

The first law: A robot may not injure a human being or, through inaction, allow a human being to come to harm. The second law: A robot must obey the orders given it by human beings except where such orders would conflict with the First Law. The third law: A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

The laws are implied in many of Asimov’s Robot books, but they are first stated explicitly in The Robots of Dawn. Other issues are also raised in the series. In later books, Asimov added a “zeroth law” of robotics: “A robot may not harm humanity, or, by inaction, allow humanity to come to harm”; this was introduced in Robots and Empire and again in The Caves of Steel.

The first three laws are similar to “The Three Laws of Eddorian Robotics” appearing in Eddore from the Lensman series by E. E. Smith. However, Smith’s laws are not a fundamental part of his future history, but are instead a throwaway item. Smith stated that his laws were merely a literary device to be used in his stories, and that they do not reflect any deep philosophical idea.

(More product placement?!)

In this case, it seems as if we get a factful response. Compare, for example, the result of typing the interrogative form of the statement into the Google web search engine:

In this case, we get a similar response, but with some form of attribution.

So, OpenAI can perhaps be used as a knowledge engine. But how trustworthy is it?

In formal logics of knowledge and belief, a distinction is often made between a belief statement, which the system holds and uses to reason about the world, and a knowledge statement, which the system holds and uses to reason about the world and which is also true in and of the world.
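In the standard modal logic treatments, that difference is often put axiomatically: knowledge is required to satisfy a truth axiom, whereas belief is typically only required to be internally consistent. A rough sketch, where K reads “the system knows” and B “the system believes”:

```latex
K\varphi \rightarrow \varphi               % axiom T: what is known is also true of the world
B\varphi \rightarrow \lnot B \lnot\varphi  % axiom D: beliefs need not be true, merely consistent
```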

In some respects, we might see Google as attempting to perform as a knowledge engine, returning facts that are true, and OpenAI as a belief engine, freewheeling a response based on what it’s just heard and what it’s heard others say before.

Let’s consider another example, the start of a declarative statement that we might be offered in response to an assessment question asked in a simple databases module, for example:

The main differences between relational databases and document databases are

that document databases are:

Not relational. This means that relationships between data entities are not enforced in a strict sense. Document databases are designed to store and retrieve individual documents, not rows in a table.

Non-normalized. Document databases do not store data in multiple tables that are related by key fields. Instead, all the data for a given entity is stored in a single document.

In addition, there is no enforced integrity or consistency between entities in a document database.

Not ACID-compliant. ACID stands for Atomicity, Consistency, Isolation, Durability. These are the four basic properties of a transaction and are used to guarantee data integrity in a database.

Because document databases do not enforce relationships between entities, they do not meet the ACID test and thus are not transactional.

However, document databases address the same three basic CRUD (Create, Read, Update, Delete) scenarios as other non-relational databases.

They can also be queried and filtered using SQL queries and can scale to hundreds of nodes with a single database cluster.

Also, they can be hosted in a variety of cloud environments, including Amazon Web Services (AWS).

For more on why I refer to document databases as NoSQL databases, see this page on Wikipedia.

OpenAI machine generated text from initial prompt

(And even more product placement?! Is this gonna be a whole new SEO style battleground?!)

So… we get an answer, of a sort. (Computing academics are welcome to score the above answer out of 5 and submit marks as comments to this post…). One thing I note in particular is the reference (in text) to Wikipedia, but no link or page reference. At that point in the response, I had limited the engine to producing a certain number of words, so who knows whether a link or page reference would have been generated as the next item. (I should have let it play on, I guess…)

One might also wonder what other Coleridgian interruptions our automatic writing engine might experience…?

It’s not just text that the models will generate. A recent announcement from Github (owned by Microsoft) and OpenAI introduces Github Copilot, “a new AI pair programmer that helps you write better code” which claims to be able to let you “write a comment describing the logic you want, and let GitHub Copilot assemble the code for you”, “let GitHub Copilot suggest tests that match your implementation code”, and let “GitHub Copilot show you a list of solutions [so you can] evaluate a few different approaches”.
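To be concrete about what is being claimed there, the advertised interaction is something like the following. This is a hand-written illustration of the comment-to-code workflow, not actual Copilot output:

```python
# Prompt comment of the kind the marketing copy describes:
# "return the n most common words in a text file"
# The function body below is hand-written to illustrate the sort of
# completion Copilot is claimed to suggest; it is not Copilot output.
from collections import Counter

def most_common_words(path, n=10):
    with open(path) as f:
        words = f.read().lower().split()
    return Counter(words).most_common(n)
```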

In passing, I note an interesting UI feature highlighting the latter example, a nudge, literally: a not-button is nudged, enticing you to click it, and if you do, you’re presented with another example:

The code as it currently stands is based on a model trained from submissions to Github. My immediate thought was: is it possible to licence code in a way that forbids its inclusion in machine learning/AI training sets (or will it be a condition of use of Github that public code repos, at least, must hand over the right for the code, and diffs, and commit comments to be used for machine training)? Another observation I saw several folk make on the Twitterz was whether we’ll start to see folk deliberately putting bad code or exploit code into Github in an attempt to try to pollute the model. As a quality check, I wondered what would happen if every Stack Overflow question were provided with a machine generated answer based on OpenAI generated text and Copilot generated code, and upvotes and downvotes were then used as an error/training signal. Then @ultrazool/Jo pointed out that a training signal can already be generated from suggested code that later appears in a git commit, presumably as a vote of confidence. We are so f****d.

It’s also interesting to ponder how this fits into higher education. In the maths and sciences, there are a wide range of tools that support productivity and correctness. If you want a solution to, or the steps in a proof of, a mathematical or engineering equation, Wolfram Alpha will do it for you. Now, it seems, if you want an answer to a simple code question, Copilot will offer a range of solutions for your delectation and delight.

At this point, it’s maybe worth noting that code reuse is an essential part of coding practice, reusing code fragments you have found useful (and perhaps then adding them to the language in the form of code packages on PyPi), as for example described in this 2020 arXiv preprint on Code Duplication and Reuse in Jupyter Notebooks.

So when it comes to assessment, what are we to do: should we create assessments that allow learners to use knowledge and productivity tools, or should we be constraining them to do their own work, excluding the use of the mechanical (Wolfram Alpha?) or statistical-mechanical (OpenAI) support tools available to the contemporary knowledge worker?

PLAGIARISM WARNING – the use of assessment help services and websites

The work that you submit for any assessment/exam on any module should be your own. Submitting work produced by or with another person, or a web service or an automated system, as if it is your own is cheating. It is strictly forbidden by the University.

You should not:

– provide any assessment question to a website, online service, social media platform or any individual or organisation, as this is an infringement of copyright.

– request answers or solutions to an assessment question on any website, via an online service or social media platform, or from any individual or organisation.

– use an automated system (other than one prescribed by the module) to obtain answers or solutions to an assessment question and submit the output as your own work.

– discuss exam questions with any other person, including your tutor.

The University actively monitors websites, online services and social media platforms for answers and solutions to assessment questions, and for assessment questions posted by students.

A student who is found to have posted a question or answer to a website, online service or social media platform and/or to have used any resulting, or otherwise obtained, output as if it is their own work has committed a disciplinary offence under Section SD 1.2 of our Code of Practice for Student Discipline. This means the academic reputation and integrity of the University has been undermined.

And when it comes to the tools, how should we view things like OpenAI and Copilot? Should we regard them as belief engines, rather than knowledge engines, and if so, how should we then interact with them? Should we be starting to familiarise ourselves with the techniques described in Automatic Detection of Machine Generated Text: A Critical Survey, or is that being unnecessarily prejudiced against the machine?
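One family of detectors discussed in that survey scores a passage by how statistically predictable it looks to a language model, on the basis that machine generated text tends to look suspiciously “expected”. A rough sketch of the idea using the Hugging Face transformers library (the choice of GPT-2 as the scoring model, and any threshold you might apply, are my own illustrative assumptions, not a recipe from the paper):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text):
    """Score a passage by how predictable it is under GPT-2 (lower = more predictable)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

# A (weak) heuristic: machine generated text often scores lower than human prose
print(perplexity("It was a cold, grey, morning. I picked up the chain saw."))
```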

Skimming the OpenAI docs [Answer questions guide], I note that one of the ways of using OpenAI is as “a dedicated question-answering endpoint useful for applications that require high accuracy text generations based on sources of truth like company documentation and knowledge bases”. The “knowledge” is provided as “additional context”, uploaded via additional documents that can be used to top up the model. The following code fragment jumped out at me though:

{"text": "puppy A is happy", "metadata": "emotional state of puppy A"}
{"text": "puppy B is sad", "metadata": "emotional state of puppy B"}

The data is not added as a structured data object, such as `{subject: A, type: puppy, state: happy}`; it is added as a text sentence.
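For reference, a minimal sketch of how the guide then has you query against those text lines via the answers endpoint, based on my reading of the 2021-era docs (the model names and example values here are illustrative, not a recipe lifted from the page):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# The "sources of truth" are supplied as plain text sentences, as in the
# puppy fragment above, rather than as structured records
response = openai.Answer.create(
    search_model="ada",   # engine used to rank the documents
    model="curie",        # engine used to generate the answer
    question="Which puppy is happy?",
    documents=["puppy A is happy", "puppy B is sad"],
    examples_context="In 2017, U.S. life expectancy was 78.6 years.",
    examples=[["What is human life expectancy in the United States?", "78 years."]],
    max_tokens=5,
)

print(response["answers"][0])
```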

Anyone who has looked at problem solving strategies as a general approach in any domain is probably familiar with the idea that the way you represent a problem can make it easier (or harder) to solve. In many computing (and data) tasks, solutions are often easier if you represent the problem in a very particular, structured way. The semantics are essentially mapped to syntax, so if you get the syntax right, the semantics follow. But here we have an example of taking structured data and mapping it into natural language, where it is presumably added to the model, but with added weight applied in recall?
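As a trivial sketch of that mapping, going from a (made-up) structured record to the sort of text line the guide expects (the field names here are my own, purely for illustration):

```python
# Flatten a structured record into the natural language "text" line used by
# the answers guide; the field names are illustrative, not an official schema
def record_to_text(record):
    return f"{record['type']} {record['subject']} is {record['state']}"

print(record_to_text({"subject": "A", "type": "puppy", "state": "happy"}))
# -> puppy A is happy
```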

This puts me in mind of a couple of other things:

  • a presentation by someone from Narrative Science or Automated Insights (I forget which) many years ago, commenting on how one use for data-to-text engines was to generate text sentences from every row of data in a database so that it could then be searched for using a normal text search engine, rather than having to write a database query;
  • the use of image based representations in a lot of machine learning applications. For example, if you want to analyse an audio waveform, whose raw natural representation is a set of time ordered amplitude values, one way of presenting it to a machine learning system is to re-present it as a spectrogram, a two dimensional image with time along the x-axis and a depiction of the power of each frequency component along the y-axis (see the sketch below).
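As a concrete sketch of that second point, here is roughly how a one dimensional audio signal gets re-presented as a two dimensional spectrogram image (the test signal is made up purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

# Illustrative test signal: a 1 kHz tone plus noise, sampled at 8 kHz
fs = 8000
t = np.arange(0, 2.0, 1 / fs)
x = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.random.randn(t.size)

# Re-present the time ordered amplitude values as a time-frequency image
f, tt, Sxx = signal.spectrogram(x, fs)

plt.pcolormesh(tt, f, Sxx, shading="gouraud")
plt.xlabel("Time [s]")
plt.ylabel("Frequency [Hz]")
plt.show()
```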

It seems as if every day we are moving away from mechanical algorithms to statistical-mechanical algorithms — AI systems are stereotype engines, with prejudiced beliefs based on the biased data they are trained on — embedded in rigid mechanical processes (the computer says: “no”). So. Completely. F****d.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...