Fragment — Student Led Socratic Dialogues With an Unreliable ChatGPT

In Can We Get ChatGPT to Act Like a Relational Database And Respond to SQL Queries on Provided Datasets and pandas dataframes? I dabbled with coaching ChatGPT to query a simple dataset I had provided using SQL. If you use enough baby steps, I suspect you could push this quite a long way. However, if the queries you ask require ChatGPT to imagine implementing multiple steps in one go (for example, counting the number of items per group in a particular dataset) it can hallucinate an incorrect answer, albeit of the correct form (for example, incorrect item counts in each group).
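One way to catch that sort of hallucinated count is to compute the ground truth locally and compare it against what ChatGPT claims. A minimal sketch with pandas, using a made-up dataset (the column names and values here are purely illustrative, not the data from the original post):

```python
import pandas as pd

# An invented items-in-groups dataset of the sort described above.
df = pd.DataFrame({
    "item": ["a", "b", "c", "d", "e", "f"],
    "group": ["X", "X", "Y", "Y", "Y", "Z"],
})

# The ground-truth per-group counts, computed locally rather than by the model.
counts = df.groupby("group")["item"].count().to_dict()
print(counts)  # {'X': 2, 'Y': 3, 'Z': 1}

# A (deliberately wrong) answer of the kind ChatGPT might give: right form,
# wrong numbers. Comparing the two flags the hallucination.
chatgpt_answer = {"X": 2, "Y": 2, "Z": 2}
print(counts == chatgpt_answer)  # False
```

Even a spot check of one or two groups against the locally computed counts is enough to show whether the model's answer can be trusted.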

If we assume that ChatGPT is well intentioned, and does know or can figure out the right answer if we help it structure its reasoning using baby steps, and we also recognise that it’s often casual, sloppy or slapdash in its thinking when we ask it a question in general — that is, if we assume it is knowledgeable but unreliable, or knowledgeable but can’t remember how to get to a correct answer and so takes a leap of faith and guesses at it — we can coach it towards providing correct answers.

In the Socratic dialogue method of education, a knowledgeable tutor who professes ignorance of a topic asks structured questions of the learner. The answers to the questions lead the learner down a self-reflective reasoning path that is carefully constructed by the questioner. The learner appears to come to a reasonable solution by themselves through the answers they provide to the questions that are apparently asked from a position of ignorance. Or something like that.

We can use the approach as independent learners if we learn how to ask good questions. Children don’t need to learn how to ask questions like this: they are naturally curious. I suspect a lot of higher education educators, which is to say, expert learners who share learning journeys with their students, are people who have managed to resist the numbing effects of formal education and retain their natural curiosity, and are happy to share the sorts of questioning strategies they use.

ChatGPT is great at answering questions, although sometimes the leaps of imagination are too great. It sees how to get from here, to there, -ish, and then takes a leap to there that would be correct if it did the intermediate steps correctly, but it doesn’t – it skips the detailed working, and in doing so perhaps makes a few mistakes in getting to there from here.

So it’s unreliable.

So how about we put the actual student in the role of the questioner, and ChatGPT into the role of the as-if student, and get the actual student to ask questions of the as-if student until the (ignorant) student is happy with the answers provided by the as-if student and is happy that the as-if student appears to have shared some sort of reasonable understanding of the problem.

If we take the role of the supposedly ignorant questioner (whether we are ignorant or not does not matter), we can ask questions of ChatGPT and get it to provide answers to our questions. But suppose we are suspicious of ChatGPT, and require of ourselves that we check the answer so that we are happy it is correct not just because we are happy with the answer as provided, but also with the way the answer was arrived at. With the simple SQL group counting problem, it was easy enough to manually count the membership of some of the groups and check the counts, noting that the actual count differed from the count given by ChatGPT. In order to ascertain how ChatGPT might have arrived at the answer, we could then ask it a simpler question, to identify the members of one of the groups that it had (mis)counted; once again, it may correctly or incorrectly identify those. If we weren’t satisfied with that answer, we could step back again, asking it to enumerate all the data items in the original data and keep a running tally of the number in each group.

One way of satisfying ourselves that we are happy with the answer provided is to deny it. Simply prompting something like “I think you are wrong” or “are you sure?” and then “I think X is not…” often seems to get ChatGPT to take its answer and pick it apart to see if it can reason its way from here to there by one or more intermediate steps, steps it sometimes then provides in its follow up answer.
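The “enumerate all the data items and keep a running tally” prompt has a direct analogue in code: it is the count unrolled into its smallest baby steps, each with an inspectable intermediate state. A sketch, again with invented data, of the sort of working we would want the as-if student to show:

```python
# Illustrative (item, group) pairs — not the original dataset.
rows = [("a", "X"), ("b", "Y"), ("c", "X"), ("d", "Y"), ("e", "Y")]

# The baby-step version of a group count: take one item at a time,
# update a running tally, and show the state after each step so any
# single step can be checked on its own.
tally = {}
for item, group in rows:
    tally[group] = tally.get(group, 0) + 1
    print(item, group, dict(tally))  # the "inner working" at this step

# After the loop, tally holds the per-group counts: {'X': 2, 'Y': 3}
```

Each printed line is a step small enough to verify by eye; if one step is wrong, we roll back to the last correct state and proceed from there, which mirrors the prompting strategy described below.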

At each step, by being suspicious of the answer, we can also try to imagine decomposing the step just taken into one or more simpler steps. We can then get ChatGPT to demonstrate those steps. We can keep rolling back these steps, making tinier and tinier baby steps, each one attempting to simplify the current step into smaller steps, until we are satisfied that a baby step is performed correctly; and then we can use the (correct) result of that step as the basis for performing the next step. Again, if we are not satisfied with the next answer, we need to prompt until we get to a step that is performed correctly from the result of the step that we previously satisfied ourselves with being correct.

In this way, even if ChatGPT is unreliable in taking big steps, and we are ignorant in the topic, we can challenge each answer and prompt for the step to be broken into simpler parts so we can check the “inner working” of each step.

This process may seem like a quite laboured one, and it is. Indeed, it’s one of the reasons why many people are wary of ChatGPT. A lot of the answers it produces may appear reasonable on a quick reading, but they actually contain errors, inconsistencies, or things that aren’t quite right, errors that then get propagated. (If you think of ChatGPT as predicting the next word it produces based on the likelihood of that word appearing given the previous words, you can start to see how errors earlier in the argument are likely to get propagated as the answer continues to unfold.) To identify these errors requires a close reading. So editing the answers generated by ChatGPT can take a lot of work.

But doing the work is what learning is about: coming to an understanding.

Even if ChatGPT is unreliable, even if we are initially ignorant of a topic, we can still engage it in a dialogue where we prompt it to break down an argument into steps that we understand and are happy to agree make sense.

When using ChatGPT to help us generate an answer to a question, we need to be happy that we understand each step. This still doesn’t necessarily help in finding ways of writing “cheat-proof” assignments, but does perhaps give us a way of thinking about how things like ChatGPT might be useful for supporting learning that makes a strength of its unreliability.

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

3 thoughts on “Fragment — Student Led Socratic Dialogues With an Unreliable ChatGPT”

  1. Thank you Tony for maybe the only reasonable “take” on ChatGPT I’ve seen all week. Like you I have doubts how many would design this kind of approach.

    It flips the hyped assumptions on their head: instead of buying into the pitch that this is even intelligence and is “right”, start from a position that it is wrong and unravel it. Not only break down its logic but present counter arguments and cite evidence.

    Thanks for this.

    1. I think the “unreliable tutor” notion, as well as the “dodgy question, dodgy answer” dialogue are both worth exploring, particularly in a world where found or generated answers may be suspect.

      Using these tools iteratively has a risk of taking you down a path of increasing nonsense, of course, but that’s where you need to build negative feedback (in a cybernetics / control sense) into the process.

