ChatGPT Rehash

A lot of things have been written and shared about ChatGPT over the last few days, and I’m wary of just rehashing and resharing the stuff that is already out there in a rehash round-up post. But there’s a handful of markers and observations that I want to make a note of, things that have seemed curious to me, or that I can’t really grok yet (i.e. I’m not sure what the consequences are).

First up, a couple of my own observations. If you’ve played with the ChatGPT free research preview (“official” background info) you’ll have noticed a few obvious things: it generates responses quickly; it is stateful (that is, it can refer back to previous things in the conversation); the responses are plausible looking; and the responses often include errors of at least two sorts. Firstly, a response may be factually wrong; secondly, a response may be internally inconsistent. As an example of the latter, I asked ChatGPT to generate a marking scheme out of 10 marks and the marks added up to 9; when I asked it to revise the scheme to be out of 10, the next attempt got to 9.5, before I finally found a way to revise the scheme so that the marks added up to 10.
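As an aside, that sort of arithmetic slip is trivial to catch mechanically rather than by eyeballing the scheme. A minimal sketch (the mark categories and values here are made up, not ChatGPT’s actual output) might be:

```python
# Hypothetical marks extracted from a ChatGPT-generated marking scheme.
scheme = {
    "Identifies key concepts": 3,
    "Addresses misunderstandings": 3,
    "Clarity of explanation": 2,
    "Within word count": 1,
}

total = sum(scheme.values())
expected = 10

if total != expected:
    print(f"Marks add up to {total}, not {expected} - revise the scheme.")
else:
    print("Marking scheme totals check out.")
```

Which rather suggests that for now, anything numerical that comes back from ChatGPT needs a sanity check on the way out…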

ChatGPT is very sensitive to the prompts you give it: phrase something one way and it tells you it’s just a large language model (LLM) trained by OpenAI and that it can’t help; slightly tweak the prompt and it can do what you asked. If you just reload the same prompt that didn’t work before in a fresh session, it sometimes now does work, so there’s some element of randomness in there too, assuming there isn’t some change in the model or invisible starting state between trials.
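That run-to-run variation is consistent with the model sampling its next token from a probability distribution rather than always picking the single most likely continuation. A toy sketch of temperature-scaled sampling (the token scores here are invented for illustration; I have no visibility of ChatGPT’s actual settings) might look like:

```python
import math
import random

def sample_with_temperature(logits, temperature=0.7):
    """Sample an index from raw scores after temperature-scaled softmax."""
    scaled = [score / temperature for score in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs)[0]

# Made-up scores for three candidate next tokens.
logits = [2.0, 1.5, 0.3]
picks = [sample_with_temperature(logits) for _ in range(10)]
print(picks)  # a different mix of indices on each run
```

The higher-scoring token wins most of the time, but not always — which would be enough to explain why the same prompt sometimes refuses and sometimes complies.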

If you tell it it’s wrong (even if it isn’t), there seem to be several possible responses: a common one is that it apologises (“I apologize if my response was not accurate.”); on some occasions, it then agrees with you that it was either incorrect or might have caused confusion, and then often attempts to revise its answer; on other occasions, it might attempt to defend its position.

One thing you can do in the chat UI is edit a prompt and then resave it. This deletes all the downstream content, and that previous content appears to be forgotten (so the statefulness presumably extends only over the content of the session above). For example:

If we now edit and save the prompt in the first line, the downstream content is removed and a new answer generated that “forgets” the original assignment:

Something that I haven’t been able to recreate, and that I can’t confirm (note to self: screenshot every transaction before the next…), but that is very concerning in terms of UI design, is that ChatGPT seems to be able to claim that it can edit its earlier answers…

I haven’t really explored prompt refinement in much detail, but in reply to a colleague who seemed dismissive of the whole idea that ChatGPT might be able to generate interesting questions, I once iterated on a naive prompt to get ChatGPT to generate a question type that includes misapprehensions about a topic which the student should address in their answer:

Please create an example assessment exercise for a 10th grade computing and IT assessment in which a student should create a blog post in a layperson’s terms addressing common misunderstandings about computability and the limits of computation, including key concepts (give a list of three or four example relevant concepts). Include three or four examples of statements that demonstrate the sort of misunderstanding about computability and the limits of computation that a 10th grader might have. The question should be appropriate for someone who has a general idea of what a Turing machine is but not a theoretical computer science understanding of it. Add a sensible maximum word count to the end of the exercise. Then provide an example marking guide out of 10 marks.

My prompt to ChatGPT

For me, it’s not so much what ChatGPT produces but the process by which you get it to produce things and develop your own ideas: this starts with how you frame your initial prompt, and although you need to be suspicious about what ChatGPT produces in response, you can still use it to explore your own understanding, not least through refining your prompts in order to get ChatGPT to either refine or re-present its previous offerings, or generate a new response as a further provocation to you.

Also as far as prompt strategies go, the following three step strategy may be useful if you need to persuade ChatGPT to provide a particular response that it was reluctant to provide if asked straight out.

In the second step, we are essentially getting ChatGPT to create its own prompt.

Currently, it is possible to get ChatGPT to produce content that triggers a content warning.

In passing, here are some other things I’ve noticed other folk talking about:

Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking or looking for correct answers.

The primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce.

  • Simon Willison is using AI tools to help him learn Rust, using Advent of Code challenges to motivate daily activities. See Learning Rust with ChatGPT, Copilot and Advent of Code and the associated GitHub issue tracker being used as a learning diary. This is the sort of innovation in teaching and learning that I don’t think we are doing internally and should be…
  • this Twitter thread by @GuyP has some interesting examples of using ChatGPT to generate prompts for text2image AI services. So we can apparently bootstrap a generative AI pipeline from a generative AI… (Note: I am taking these on trust – they could have been faked…)
  • in another Twitter post, @StructStories hints at how we might be able to generate structured data story templates:

I found that VM example interesting from a “role play” perspective. Trying to design engaging activities can often be a time consuming affair, particularly if you are trying to work out what steps are required to make an activity work. In the same way that web designers (used to?) use wireframes to mock up web UIs rather than writing HTML with nothing behind it, we might be able to quickly role play various activity set-ups using ChatGPT to get a feeling of what particular interactions might be like and what sort of output they might present in a risk-free, ersatz sandbox…

  • writing on his Stratechery blog (AI Homework), Ben Thompson suggests providing kids with “Zero Trust Homework”, where “instead of futilely demanding that students write essays themselves, teachers insist on [students generating essays using] AI”. Because the AI is unreliable, it’s down to the student to verify the answers and identify and correct any errors. I’m increasingly of the mind that an equivalent of a “calculator paper” could be interesting for assessment, where the questions are such that a student needs to use AI tools to solve them in a particular amount of time, but also where those tools are unreliable, so that you are actually assessing both prompt design and verification/editorial skills.

PS another way in which ChatGPT can be used as a playground: inventing a language (via @simonw).

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
