Rapid ipywidgets Prototyping Using Third Party Javascript Packages in Jupyter Notebooks With jp_proxy_widget

Just before the break, I came across a rather entrancing visualisation of Jean Michel Jarre’s Oxygene album in the form of an animated spectrogram.

Time is along the horizontal x-axis, and frequency along the vertical y-axis. The bright colours show the presence, and volume, of each frequency as the track plays out.

Such visualisations can help you hear-by-seeing the structure of the sound as the music plays. So I wondered… could I get something like that working in a Jupyter notebook….?

And it seems I can, using the rather handy jp_proxy_widget, which provides a way of easily loading jQueryUI components, as well as the require.js module, to load and run Javascript widgets.

Via this StackOverflow answer, which shows how to embed a simple audio visualisation into a Jupyter notebook using the Wavesurfer.js package, I note that Wavesurfer.js also supports spectrograms. The example page docs are a bit ropey, but a look at the source code and the plugin docs revealed what I needed to know…

#%pip install --upgrade ipywidgets
#!jupyter nbextension enable --py widgetsnbextension

#%pip install jp_proxy_widget

import jp_proxy_widget

widget = jp_proxy_widget.JSProxyWidget()

js = "https://unpkg.com/wavesurfer.js"
#The spectrogram plugin is loaded separately
js2 = "https://unpkg.com/wavesurfer.js/dist/plugin/wavesurfer.spectrogram.min.js"
url = "https://ia902606.us.archive.org/35/items/shortpoetry_047_librivox/song_cjrg_teasdale_64kb.mp3"

widget.load_js_files([js, js2])


widget.js_init("""
element.wavesurfer = WaveSurfer.create({
    container: element[0],
    waveColor: 'violet',
    progressColor: 'purple',
    loaderColor: 'purple',
    cursorColor: 'navy',
    minPxPerSec: 100,
    scrollParent: true,
    plugins: [
        WaveSurfer.spectrogram.create({
            wavesurfer: element.wavesurfer,
            container: element[0],
            labels: true
        })
    ]
});

element.wavesurfer.load(url);

element.wavesurfer.on('ready', function () {
    element.wavesurfer.play();
});
""", url=url)


#Playback can be started and stopped from Python via the proxy:
#widget.element.wavesurfer.play()
#widget.element.wavesurfer.pause()

#It would probably make sense to wire up these commands to ipywidgets buttons...

The code is also saved as a gist here and can be run on MyBinder (the dependencies should be automatically installed):

Here’s what it looks like (It may take a moment or two to load when you run the code cell…)

It doesn’t seem to work in JupyterLab though… [UPDATE: following recent patches to jp_proxy_widget, it may well work now…]

It looks like the full ipywidgets machinery is supported, so we can issue start and stop commands from the Python notebook environment that control the widget Javascript.

So now I’m wondering what other Javascript apps are out there that might be interesting in a Jupyter notebook context, and how easy it’d be to get them running…?

It might also be interesting to try to construct an audio file within the notebook and then visualise it using the widget.
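As a crude proof of concept for that, here's a minimal sketch, using just the Python standard library, of how a simple sine wave audio file might be synthesised from code (the filename and tone parameters are arbitrary choices of mine):

```python
import math
import struct
import wave

def make_sine_wav(fn='test_tone.wav', freq=440, duration=2, rate=44100):
    """Write a mono, 16 bit WAV file containing a sine wave tone."""
    n_frames = int(duration * rate)
    # Pack each sample as a little-endian signed 16 bit integer
    frames = b''.join(struct.pack('<h',
                                  int(32767 * math.sin(2 * math.pi * freq * i / rate)))
                      for i in range(n_frames))
    with wave.open(fn, 'wb') as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16 bit samples
        w.setframerate(rate)
        w.writeframes(frames)

make_sine_wav()
```

The resulting WAV file could then be passed to the widget in place of the remote mp3 URL.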

PS ipywidget slider cross-referencing the wavesurfer.js playhead: https://gist.github.com/scottire/654019e88e6225c15a68006ab4a3ba98 h/t @_ScottCondron

Dockerising / Binderising the TM351 Virtual Machine

Just before the Christmas break, I had a go at recasting the TM351 VM as a Docker container built from a Github repository using MyBinder (which is to say: I had a go at binderising the VM…). Long time readers will know that this virtual machine has been used to deliver a computing environment to students on the OU TM351 Data management and Analysis course since 2016. The VM itself is built using Virtualbox, provisioned using vagrant, and then distributed originally via a mailed out USB stick, or alternatively (which is to say, unofficially; though my preferred route) as a download from VagrantCloud.

The original motivation for using Vagrant was a hope that we’d be able to use a range of provisioners to construct VM images for a range of virtualisation platforms, but that’s never happened. We still ship a Virtualbox image that causes problems for a small number of Windows users each year, rather than a native HyperV image, because: a) I refuse to buy a Windows machine so I can build the HyperV image myself; b) no-one else sees benefit from offering multiple images (perhaps because they don’t provide the tech support…).

For all our supposed industrial scale at delivering technology backed “solutions”, the VM is built, maintained and supported on a cottage industry basis from within the course team.

For a scaleable solution that would work:

a) within a module presentation;
b) across module presentations;
c) across modules

I think we should be looking at some sort of multi-user hosted service, with personal accounts and persistent user directories. There are various ways I can imagine delivering this, each of which creates its own issues as well as solving particular problems.

As a quick example, here are two possible extremes:

1) one JupyterHub server to rule them all: every presentation, every module, one server. JupyterHub can be configured to use the DockerSpawner to present different kernel container options to the user, (although I’m not sure if this can be personalised on a per user basis? If not, that feature would make for a useful contribution back…), so a student could be presented with a list of containers for each of their modules.

2) one JupyterHub server per module per presentation: this requires more admin and means servers everywhere, but it separates concerns…

The experimental work on a “persistent Binderhub deployment” also looks interesting, offering the possibility of launching arbitrary environments (as per Binderhub) against personally mounted file area (as per JupyterHub).

Providing a “takeaway” service is also one of my red lines: a student should be free to take away any computing environment we provide them with. One in-testing hosted version of the TM351 VM comes, I believe, with centralised Postgres and MongoDB servers that students have accounts on and must log in to. Providing a multi-user service, rather than a self-contained personal server, raises certain issues regarding support, but also denies the student the ability to take away the database service and use it for their own academic, personal or even work purposes. A fundamentally wrong approach, in my opinion. It’s just not open.

So… binderising the VM…

When Docker containers first made their appearance, best practice seemed to be to have one service per container, and then wire containers together using docker-compose to provide a more elaborate environment. I have experimented in the past with decoupling the TM351 services into separate containers and then launching them using docker-compose, but it’s never really gone anywhere…

In the communities of practice that I come across, more emphasis now seems to be on putting everything into a single container. Binderhub is also limited to launching a single container (I don’t think there is a Jupyter docker-compose provisioner yet?) so that pretty much seals it… All in one…

A proof-of-concept Binderised version of the TM351 VM can be found here: innovationOUtside/tm351vm-binder.

It currently includes:

  • an OU branded Jupyter notebook server running jupyter-server-proxy;
  • the TM351 Python environment;
  • an OpenRefine server proxied using jupyter-server-proxy;
  • a Postgres server seeded (I think? Did I do that yet?!) with the TM351 test db (if I haven’t set it up as per the VM, the code is there that shows how to do it…);
  • a MongoDB server serving the small accidents dataset that appears in the TM351 VM.

What is not included:

  • the sharded Mongo DB activity; (the activity it relates to, as presented at the moment, is largely pointless, IMHO; we could demonstrate the sharding behaviour with small datasets, and if we did want to provide queries over the large dataset, that might make sense as something we host centrally and let students log in to query. Which would also give us another teaching point.)

The Binder configuration is provided in the binder/ directory. An Anaconda binder/environment.yml file is used to install packages that are complicated to build or install otherwise, such as Postgres.

The binder/postBuild file is run as a shell script responsible for:

  • configuring the Postgres server and seeding its test database;
  • installing and seeding the MongoDB database;
  • installing OpenRefine;
  • installing Python packages from binder/requirements.txt (the requirements.txt is not otherwise automatically handled by Binderhub — it is trumped by the environment.yml file);
  • enabling required Jupyter extensions.

If any files handled via postBuild need to be persisted, they can be written into $CONDA_DIR.

(As a reference, I have also created some simple standalone template repos showing how to configure Postgres and MongoDB in Binderhub/repo2docker environments. There’s also a neo4j demo template too.)

The binder/start file is responsible for:

  • defining environment variables and paths required at runtime;
  • starting the PostgreSQL and MongoDB database services.

(OpenRefine is started by the user from the notebook server homepage or JupyterLab. There’s a standalone OpenRefine demo repo too…)

Launching the repo using MyBinder will build the TM351 environment (if a Binder image does not already exist) and start the required services. The repo can also be used to build an environment locally using repo2docker.

As well as building a docker image within the Binderhub context, the repo is also automated with a Github Action that is used to build release commits using repo2docker and then push the resulting container to Docker Hub. The action can be found in the .github/workflows directory. The container can be found as ousefuldemos/tm351-binderised:latest. When running a container derived from this image, the Jupyter notebook server runs on the default port 8888 inside the container, the OpenRefine application is proxied through it, and the database services should autostart. The notebook server is started with a token required, so you need to spot the token from the start up logs of the container – which means you shouldn’t run it with the -d flag. A variant of the following command should work (I’m not sure how you reliably specify the correct $PWD (present working directory) mount directory from a Windows command prompt):

docker run --name tm351test --rm -p 8895:8888 -v $PWD/notebooks:/notebooks -v $PWD/openrefine_projects:/openrefine_projects ousefuldemos/tm351-binderised:latest

Probably easier is to use the Kitematic-inspired containds “personal Binderhub” app, which can capture and handle the token automatically and let you click straight through into the running notebook server. Either use containds to build the image locally by providing the repo URL, or select a new image and search for tm351: the ousefuldemos/tm351-binderised image is the one you want. When prompted, select the “standard” launch route, NOT the ‘Try to start Jupyter notebook’ route.

Although I’ve yet to try it (I ran out of time before the break), I’m hopeful that the prebuilt container should work okay with JupyterHub. If it does, this means the innovationOUtside/tm351vm-binder repo can serve as a template for building images that can be used to deploy authenticated OU computing environments via an OU authenticated and OU hosted JupyterHub server (one can but remain hopeful!).

If you try out the environment, either using MyBinder, via repo2docker, or from the pre-built Docker image, please let me know either here, via the repo issues, or howsoever: a) whether it worked; b) whether it didn’t; c) whether there were any (other) issues. Any and all feedback would be much appreciated…

Simple Rule Based Approach in Python for Generating Explanatory Texts from pandas Dataframes

Many years ago, I used to use rule based systems all the time, first as a postdoc, working with the Soar rule based system to generate “cognitively plausible agents”, then in support of the OU course T396 Artificial Intelligence for Technology.

Over the last couple of years, I’ve kept thinking that a rule based approach might make sense for generating simple textual commentaries from datasets. I had a couple of aborted attempts around this last year using pytracery (eg here and here) but the pytracery approach was a bit too clunky.

One of the tricks I did learn at the time was that things could be simplified by generating data truth tables that encode the presence of particular features in “enrichment” tables that could be used to trigger particular rules.

These tables would essentially encode features that could be usefully processed in simple commentary rules. For example, in rally reporting, something like “X took stage Y, his third stage win in a row, increasing his overall lead by P seconds to QmRs” could be constructed from an appropriately defined feature table row.
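As a hypothetical sketch of the sort of thing I mean (the field names and values in the feature row are invented for illustration), a sentence of that form can be filled directly from such a row:

```python
# A hypothetical "enrichment" feature table row: all field names and values
# are invented for illustration.
feature_row = {'crew': 'X', 'stage': 'Y', 'consecutive_wins': 3,
               'lead_increase_s': 23, 'overall_lead': '4m17s'}

# Simple ordinal word lookup for small counts
ordinals = {1: 'first', 2: 'second', 3: 'third', 4: 'fourth'}

commentary = (f"{feature_row['crew']} took stage {feature_row['stage']}, "
              f"his {ordinals[feature_row['consecutive_wins']]} stage win in a row, "
              f"increasing his overall lead by {feature_row['lead_increase_s']} seconds "
              f"to {feature_row['overall_lead']}.")
```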

I’m also reminded that I started to explore using symbolic encodings to try to encode simple feature runs as strings and then use regular expressions to identify richer features within them (for example, Detecting Features in Data Using Symbolic Coding and Regular Expression Pattern Matching).
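For example, here's a minimal sketch of that symbolic coding idea (the coding scheme itself is made up):

```python
import re

# Made-up symbolic coding: w = stage win, p = podium place, o = other result
results = ['win', 'win', 'win', 'podium', 'other']
coded = ''.join({'win': 'w', 'podium': 'p'}.get(r, 'o') for r in results)

# A run of three or more win symbols reads as a "hat-trick" style feature
hat_trick = re.search(r'w{3,}', coded) is not None
```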

Anyway, a brief exchange today about possible PhD projects for faculty-funded PhD studentships, starting in October (the project will be added here at some point…) got me thinking again about this… So as the Dakar rally is currently running, and as I’ve been scraping the results, I wondered how easy it would be to pull an off-the-shelf python rules engine, erm, off the shelf, and create a few quick rally commentary rules…

And the answer is, surprisingly easy…

Here’s a five minute example of some sentences generated from a couple of simple rules using the durable_rules rules engine.

The original data looks like this:

and the generated sentences look like this: JI. CORNEJO FLORIMO MONSTER ENERGY HONDA TEAM 2020 were in fifth position, 11 minutes and 19 seconds behind the first placed HONDA.

Let’s see how the various pieces fit together…

For a start, here’s what the rules look like:

from durable.lang import *
import inflect

#Set up an inflect engine for generating natural language ordinals etc.
p = inflect.engine()

txts = []

with ruleset('test1'):
    #Display something about the crew in first place
    @when_all(m.Pos == 1)
    def whos_in_first(c):
        """Generate a sentence to report on the first placed vehicle."""
        #We can add additional state, accessible from other rules
        #In this case, record the Crew and Brand for the first placed crew
        c.s.first_crew = c.m.Crew
        c.s.first_brand = c.m.Brand
        #Python f-strings make it easy to generate text sentences that include data elements
        txts.append(f'{c.m.Crew} were in first in their {c.m.Brand} with a time of {c.m.Time_raw}.')
    #This just checks whether we get multiple rule fires...
    @when_all(m.Pos == 1)
    def whos_in_first2(c):
        txts.append('we got another first...')
    #We can be a bit more creative in the other results
    @when_all(m.Pos > 1)
    def whos_where(c):
        """Generate a sentence to describe the position of each other placed vehicle."""
        #Use the inflect package to natural language textify position numbers...
        nth = p.number_to_words(p.ordinal(c.m.Pos))
        #Use various probabilistic text generators to make a comment for each other result
        first_opts = [c.s.first_crew, 'the stage winner']
        if c.m.Brand==c.s.first_brand:
            first_opts.append(f'the first placed {c.m.Brand}')
        t = pickone_equally([f'with a time of {c.m.Time_raw}',
                             f'{sometimes(f"{display_time(c.m.GapInS)} behind {pickone_equally(first_opts)}")}'],
                           prefix=', ')
        #And add even more variation possibilities into the returned generated sentence
        txts.append(f'{c.m.Crew} were in {nth}{sometimes(" position")}{sometimes(f" representing {c.m.Brand}")}{t}.')

Each rule in the ruleset is decorated with a conditional test applied to the elements of a dict passed in to the ruleset. Rules can also set additional state, which can be tested by, and accessed from within, other rules.

Rather than printing out statements in each rule, which was the approach taken in the original durable_rules demos, I instead opted to append generated text elements to an ordered list (txts), that I could then join and render as a single text string at the end.

(We could also return a tuple from a rule, eg (POS, TXT) that would allow us to re-order statements when generating the final text rendering.)
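A minimal sketch of that re-ordering idea, with made-up sentence fragments:

```python
# Collect (position, text) tuples rather than bare strings...
txt_tuples = [(3, 'Third place commentary.'),
              (1, 'First place commentary.'),
              (2, 'Second place commentary.')]

# ...then sort on position before rendering the final report text
report = ' '.join(t for _, t in sorted(txt_tuples))
```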

The data itself was grabbed from my Dakar scrape database into a pandas dataframe using a simple SQL query:

q = f"SELECT * FROM ranking WHERE VehicleType='{VTYPE}' AND Type='general' AND Stage={STAGE} AND Pos<={MAXPOS}"
#VTYPE, STAGE and MAXPOS (the lowest ranked position to report) are set elsewhere
tmpq = pd.read_sql(q, conn)

A display_time() helper converts a gap in seconds into a natural language time string:

intervals = (('weeks', 604800),
             ('days', 86400),
             ('hours', 3600),
             ('minutes', 60),
             ('seconds', 1))

def display_time(t, granularity=2, andword='and', units='seconds', intify=True):
    """Take a time in seconds and return a natural language style time string."""

    def nl_join(l):
        """Join a list of strings as a natural language list."""
        if len(l)>2:
            return ', '.join(l[:-1]) + f' {andword} {str(l[-1])}'
        elif len(l)==2:
            return f' {andword} '.join(l)
        return l[0]

    result = []

    if intify:
        t = int(t)

    #Need better handling for arbitrary time strings
    #Perhaps parse into a timedelta object
    # and then generate NL string from that?
    if units=='seconds':
        for name, count in intervals:
            value = t // count
            if value:
                t -= value * count
                if value == 1:
                    name = name.rstrip('s')
                result.append("{} {}".format(value, name))

        return nl_join(result[:granularity])

To add variety to the rule generated text, I played around with some simple randomisation features when generating commentary sentences. I suspect there’s a way of doing things properly “occasionally” via the rules engine, but that could require some clearer thinking (and reading the docs…), so it was easier to create some simple randomising functions that I could call on within a rule to create statements “occasionally” as part of the rule code.

So for example, the following functions help with that, returning strings probabilistically.

import random

def sometimes(t, p=0.5):
    """Return the provided string with probability p, else return an empty string."""
    if random.random() <= p:
        return t
    return ''

def occasionally(t):
    """Sometimes return a string passed to the function."""
    return sometimes(t, p=0.2)

def rarely(t):
    """Rarely return a string passed to the function."""
    return sometimes(t, p=0.05)

def pickone_equally(l, prefix='', suffix=''):
    """Return an item from a list,
       selected at random with equal probability."""
    t = random.choice(l)
    if t:
        return f'{prefix}{t}{suffix}'
    return suffix

def pickfirst_prob(l, p=0.5):
    """Select the first item in a list with the specified probability,
       else select an item, with equal probability, from the rest of the list."""
    if len(l)>1 and random.random() >= p:
        return random.choice(l[1:])
    return l[0]

The rules handler doesn’t seem to like the numpy typed numerical objects that the pandas dataframe provides [UPDATE: it turns out this is a python json library issue: it doesn’t like np.int64s…], but if we cast the dataframe values to JSON and then back to a Python dict, everything seems to work fine.

import json
#This handles numpy types that ruleset json serialiser doesn't like
tmp = json.loads(tmpq.iloc[0].to_json())

One nice thing about the rules engine is that you can apply statements that are processed by the rules in a couple of ways: as events and as facts.

If we post a statement as an event, then only a single rule can be fired from it. For example:

post('test1', tmp)
generates a sentence along the lines of R. BRABEC MONSTER ENERGY HONDA TEAM 2020 were in first in their HONDA with a time of 10:39:04.

We can create a function that can be applied to each row of a pandas dataframe that will run the contents of the row, expressed as a dict, through the ruleset:

def rulesbyrow(row, ruleset):
    row = json.loads(json.dumps(row.to_dict()))
    post(ruleset, row)

Capture the text results generated from the ruleset into a list, and then display the results.

tmpq.apply(rulesbyrow, ruleset='test1', axis=1)
print('\n\n'.join(txts))

The sentences generated each time (apart from the sentence generated for the first position crew) contain randomly introduced elements even though the rules are applied deterministically.

R. BRABEC MONSTER ENERGY HONDA TEAM 2020 were in first in their HONDA with a time of 10:39:04.

K. BENAVIDES MONSTER ENERGY HONDA TEAM 2020 were in second representing HONDA.


J. BARREDA BORT MONSTER ENERGY HONDA TEAM 2020 were in fourth, with a time of 10:50:06.

JI. CORNEJO FLORIMO MONSTER ENERGY HONDA TEAM 2020 were in fifth, 11 minutes and 19 seconds behind the stage winner.

We can evaluate a whole set of events passed as list of events using the post_batch(RULESET,EVENTS) function. It’s easy enough to convert a pandas dataframe into a list of palatable dicts…

def df_json(df):
    """Convert rows in a pandas dataframe to a JSON string.
       Cast the JSON string back to a list of dicts
       that are palatable to the rules engine."""
    return json.loads(df.to_json(orient='records'))

Unfortunately, the post_batch() route doesn’t look like it necessarily commits the rows to the ruleset in the provided row order? (Has the dict lost its ordering?)


post_batch('test1', df_json(tmpq))

R. BRABEC MONSTER ENERGY HONDA TEAM 2020 were in first in their HONDA with a time of 10:39:04.

X. DE SOULTRAIT MONSTER ENERGY YAMAHA RALLY TEAM were in tenth position, with a time of 10:58:59.

S. SUNDERLAND RED BULL KTM FACTORY TEAM were in ninth, with a time of 10:56:14.

P. QUINTANILLA ROCKSTAR ENERGY HUSQVARNA FACTORY RACING were in eighth position representing HUSQVARNA, 15 minutes and 40 seconds behind R. BRABEC MONSTER ENERGY HONDA TEAM 2020.

We can also assert the rows as facts rather than running them through the ruleset as events. Asserting a fact adds it as a persistent fact to the rule engine, which means that it can be used to trigger multiple rules, as the following example demonstrates (check the ruleset definition to see the two rules that match on the first position condition).

Once again, we can create a simple function that can be applied to each row in the pandas dataframe / table:

def factsbyrow(row, ruleset):
    row = json.loads(json.dumps(row.to_dict()))
    assert_fact(ruleset, row)

In this case, when we assert the fact, rather than post a once-and-once-only resolved event, the fact is retained even if it matches a rule, so it gets a chance to match other rules too…

tmpq.apply(factsbyrow, ruleset='test1', axis=1);

R. BRABEC MONSTER ENERGY HONDA TEAM 2020 were in first in their HONDA with a time of 10:39:04.

we got another first…

K. BENAVIDES MONSTER ENERGY HONDA TEAM 2020 were in second, with a time of 10:43:47.

M. WALKNER RED BULL KTM FACTORY TEAM were in third representing KTM.

J. BARREDA BORT MONSTER ENERGY HONDA TEAM 2020 were in fourth representing HONDA, with a time of 10:50:06.

JI. CORNEJO FLORIMO MONSTER ENERGY HONDA TEAM 2020 were in fifth position, 11 minutes and 19 seconds behind the first placed HONDA.

The rules engine is much richer in what it can handle than I’ve shown above (the reference docs provide more examples, including how you can invoke state machine and flowchart behaviours, for example in a business rules / business logic application) but even used in my simplistic way, it still offers quite a lot of promise for generating simple commentaries, particularly if I also make use of enrichment tables and symbolic strings (the rules engine supports pattern matching operations in the conditions).

In passing, I also note a couple of minor niggles. Firstly, you can’t seem to clear the ruleset, which means in a Jupyter notebook environment you get an error if you try to update a ruleset and run that code cell again. Secondly, if you reassert the same facts into a ruleset context, an error is raised that also borks running the ruleset again. (That latter one might make sense depending on the implementation, although the error is handled badly? I can’t think through the consequences… The behaviour I’d expect from reasserting a fact is for that fact to be retracted and then reasserted… UPDATE: retract_fact() lets you retract a fact.)

FWIW, the code is saved as a gist here, although with the db it’s not much use directly…

Installing Applications via postBuild in MyBinder and repo2docker

A note on downloading and installing things into a Binderised repo, or a container built using repo2docker.

If you save files into $HOME as part of the container build process, and then try to use the image outside of MyBinder, you may find that your saved files are clobbered when storage volumes or local directories are mounted onto $HOME.

The MyBinder / repo2docker build is pretty limiting in terms of the permissions the default jovyan user has over the file system. $HOME is one place you can write to, but if you need somewhere outside that path, then $CONDA_DIR (which defaults to /srv/conda) is handy…

For example, I just tweaked my neo4j binder repo to install a downloaded neo4j server into that path.

Fragment – Metrics for Jupyter Notebook Based Educational Materials

Complementing the Jupyter notebook visualisations described in the previous post, I’ve also started dabbling with notebook metrics. These appear to be going down spectacularly badly with colleagues, but I’m going to carry on poking a stick at them nevertheless. (When I get a chance, I’ll also start applying them across various courses to content in OU-XML documents that drives our online and print course materials… I’d be interested to know if folk in IET already do this sort of thing, since they do love talking about things like reading rates and learning designs, and automation provides such an easy way of generating huge amounts of stats and data entrails to pore over…)

The original motivation was to try to come up with some simple metrics that could be applied over a set of course notebooks. This might include things like readability metrics (are introductory notebooks easier to read, in terms of common readability scores, than teaching notebooks, for example?) and code complexity measures (do these give any insight into how hard a code cell might be to read and make sense of, for example?). The measures might also help us get a feel for which notebooks might be overloaded in terms of estimated reading time, and potentially in need of some attention on that front in our next round of continual updates.

I also wanted to start building some tools to help me explore how the course notebooks we have developed to date are structured, and whether we might be able to see any particular patterns or practice developing in our use of them that a simple static analysis might reveal.

I might also have been influenced in starting down this route by a couple of papers I mentioned in a recent edition of the Tracking Jupyter newsletter (#TJ24 — Notebook Practice) that had reviewed the “quality” (and code quality) of notebooks linked from publications and on Github.

Estimating workload as a time measure is notoriously tricky and contentious, for all manner of reasons:

  • what do we mean by workload? “Reading time” is easy enough to measure, but how does this differ from “engagement time” if we want students to “engage” or interact with our materials and not just skim over them?
  • different learners study at different rates; learners may also be pragmatic and efficient, using the demands of continuous assessment material to focus their attention on certain areas of the course material;
  • reading time estimates are based on assumed word-per-minute (wpm) rates (in the OU, our rules of thumb are 35 wpm (~2000 words per hour) for challenging texts, 70 wpm (~4k wph) for medium texts, and 120 wpm (~7k wph) for easy texts) and assume that students read every word and don’t skim. It’s likely that many students do skim read, though, flipping through pages of print material to spot headings, images (photos, diagrams, etc.) that grab attention, and exercises or self-assessment questions, so an estimate of “skim time” might also be useful. Skimming is harder to do in online environments, particularly where the user interface requires a button click at the bottom of the page to move to the next page (if the button is not in the same place on the screen for each consecutive page, and there is no keyboard shortcut, you have to focus on moving the mouse to generate the next-page button click…). So, for online rather than print users, should we give them a single page view they can skim over? (OU VLE materials do have this feature at a unit (week of study) level, via the “print as single page” option);
  • activities and exercises often don’t have a simple mapping from word count to exercise completion time; a briefly stated activity may require 15 mins of student activity, or even an hour. Activity text may state “you should expect to spend about X mins on this activity”, and structured activity texts may present expected activity time in a conventional way (identifiable metadata, essentially); when estimating the time of such activities, if we can identify the expected time, we might use this as a guide, possibly on top of the time estimated to actually read the activity text…
  • some text may be harder to read than other text, which we can model via reading time; but how do we know how hard a text is to read? Or do we just go with the most conservative reading rate estimate? Several readability metrics exist, so these could be used to analyse different blocks of text and estimate reading rates relative to the calculated readability of each block in turn;
  • for technical materials, how do we calculate reading rates associated with reading computer code, or mathematical or chemical equations? In the arts, how long does it take to look at a particular artwork? In languages, how long to read a foreign text?
  • when working with code or equations, do we want the student to read the equation or code as text or engage with it more deeply, for example by executing the code, looking at the output, perhaps making a modification to the code and then executing it again to see how the output differs? For a mathematical equation, do we want students to run some numbers through the equation, or manipulate the equation?
  • code and equations are line based, so should we use line based, rather than word based, calculations to estimate reading — or engagement — time? For example, X seconds per line, with an additional amount per cell chunk / block for environments like Jupyter notebooks, where a code chunk in a single cell often produces a single output that we might expect the student to inspect?
  • as with using readability measures to tune reading rate parameters, we might be able to use code complexity measures to generate different code appreciation rates based on code complexity;
  • again, in Jupyter notebooks, we might distinguish between code in a markdown cell, which is intended to be read but not executed, and code in a code cell, which we do expect to be executed. The code itself may also have an execution time associated with it: for example, a single line of code to train a neural network model or run a complex statistical analysis, or even a simple analysis or query over a large dataset, may take several seconds, if not minutes, to run.
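The OU rule-of-thumb reading rates mentioned above can be wrapped as a trivial estimator; a minimal sketch (the function name and difficulty banding are my own):

```python
# OU rule-of-thumb reading rates (words per minute), as quoted above
WPM = {'challenging': 35, 'medium': 70, 'easy': 120}

def reading_time_mins(word_count, difficulty='medium'):
    """Estimate the reading time in minutes for a block of text."""
    return word_count / WPM[difficulty]
```

Readability metrics could then be used to pick the difficulty band for each block of text in turn.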

And yes, I know, there is probably a wealth of literature out there about this, and some of it has probably even been produced by the OU. If you can point me to things you think I should read, and/or that put me right about things that are obviously nonsense that I’ve claimed above, please post some links or references in the comments…:-)

At this point, we might say it’s pointless trying to capture any sort of metric based on a static analysis of course materials, compared to actually monitoring student study times. Instead, we might rely on our own rules of thumb as educators: if it takes me, as an “expert learner”, X minutes to work through the material, then it will take students 3X minutes (or perhaps, 4X minutes if I work through my own material, which I am familiar with, or 3X when I work through yours, which I am less familiar with); alternatively, based on experience, I may know that it typically takes me three weeks of production time to generate one week of study material, and use that as a basis for estimating the likely study time of a text based on how long I have spent trying to produce it. Different rules of thumb for estimating different things: how long does it take me to produce X hours of study material, how long does it take students to study Y amount of material.

Capturing actual study time is possible; for our Jupyter notebooks, we could instrument them with web analytics to capture webstats about how students engage with notebooks as if they were web pages, and we could also capture Jupyter telemetry for analysis. For online materials, we can capture web stats detailing how long students appear to spend on each page of study material before clicking through to the next, and so on.

So what have I been looking at? As well as the crude notebook visualisations, my reports are in the early stages, taking the following form at the current time:

In directory `Part 02 Notebooks` there were 6 notebooks.

– total markdown wordcount 5573 words across 160 markdown cells
– total code line count of 390 lines of code across 119 code cells
– 228 code lines, 137 comment lines and 25 blank lines

Estimated total reading time of 288 minutes.

The estimate is open to debate and not really something I’ve spent much time thinking about yet (I was more interested in getting the notebook parsing and report generating machinery working): it’s currently a function of a wpm (words per minute) reading rate applied to text and a “lines of code per minute” rate applied to code. But it’s not intended to be accurate, per se, and it’s definitely not intended to be precise; it’s just intended to provide a relative estimate of how long one notebook full of text may take to study compared to one that contains text and code. The idea is to calculate the numbers for all the notebooks across all the weeks of a course; then, if we do manage to get a good idea of how long it takes a student to study one particular notebook, or one particular week, we can try to use structural similarities across other notebooks to get hopefully more accurate estimates out.
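For what it’s worth, that calculation is essentially just the following (the rates below are illustrative placeholders for the sort of parameters involved, not the values actually used):

```python
#Crude study time estimate: a words-per-minute rate for markdown text,
#a lines-of-code-per-minute rate for code.
#The rates here are illustrative guesses, not calibrated values.
WPM = 150          #assumed reading rate for study material
CODE_LPM = 10      #assumed "lines of code appreciated per minute" rate

def estimate_minutes(md_wordcount, code_linecount,
                     wpm=WPM, code_lpm=CODE_LPM):
    """Estimate notebook study time in minutes from gross counts."""
    return md_wordcount / wpm + code_linecount / code_lpm

#e.g. applied to the counts in the report above
print(round(estimate_minutes(5573, 390)))  # → 76
```

Changing the rate parameters, or the shape of the function itself, is then a one-line edit.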

The estimate is also derived in code, and it’s easy enough to change the parameters (such as reading rates, lines of code engagement rates, etc) in the current algorithm, or the algorithm itself, to generate alternative estimates. (In fact, it might be interesting to generate several alternative forms and then compare them to see how they feel, and whether the ranked estimates and normalised estimates across different notebooks stay roughly the same, or whether they give different relative estimates.)

The report itself is generated from a template fed values from a pandas dataframe cast to a dict (that is, a Python dictionary). The templates take the form:

The bracketed items refer to columns in a feedstock dataframe and templated text blocks are generated a block at a time from individual rows of the dataframe passed to the template as a feedstock dict using a construction of the form:
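To give the flavour without the screenshots, here’s a minimal stand-in (the template text and column names are invented for illustration; with pandas, the feedstock dicts would come from something like `df.to_dict(orient='records')`):

```python
#Generate a report block per row by filling a templated string.
#The template and field names here are invented for illustration.
TEMPLATE = ("In directory `{path}` there were {n_nb} notebooks.\n"
            "- total markdown wordcount {n_words} words across {n_md_cells} markdown cells\n"
            "- total code line count of {n_loc} lines of code across {n_code_cells} code cells")

#Stand-in for df.to_dict(orient='records') on the feedstock dataframe
rows = [{'path': 'Part 02 Notebooks', 'n_nb': 6, 'n_words': 5573,
         'n_md_cells': 160, 'n_loc': 390, 'n_code_cells': 119}]

report = '\n\n'.join(TEMPLATE.format(**row) for row in rows)
print(report)
```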


Robot journalism ftw… meh…

The actual metrics collected are more comprehensive, including:

  • readability measures for markdown text (flesch_kincaid_grade_level, flesch_reading_ease, smog_index, gunning_fog_index, coleman_liau_index, automated_readability_index, lix, gulpease_index, wiener_sachtextformel), as well as simple structural measures (word count, sentence count, average words per sentence (mean, median and sd), number of paragraphs, etc);
  • simple code analysis (lines of code, comment lines, blank lines) and some experimental code complexity measures.
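As a rough sketch of what’s involved, stdlib-only stand-ins for one of the readability measures and the raw code counts might look something like this (the ARI formula is the published one, but the tokenisation is deliberately naive; a proper readability package does this better):

```python
import re

def automated_readability_index(text):
    """Naive ARI: 4.71*(chars/words) + 0.5*(words/sentences) - 21.43."""
    words = re.findall(r"[\w'-]+", text)
    sentences = [s for s in re.split(r'[.!?]+', text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43

def classify_code_lines(code):
    """Count code / comment / blank lines in a code cell's source."""
    counts = {'code': 0, 'comment': 0, 'blank': 0}
    for line in code.splitlines():
        stripped = line.strip()
        if not stripped:
            counts['blank'] += 1
        elif stripped.startswith('#'):
            counts['comment'] += 1
        else:
            counts['code'] += 1
    return counts

print(classify_code_lines("x = 1\n#comment\n"))  # → {'code': 1, 'comment': 1, 'blank': 0}
```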

I’ve also started experimenting with tagging markdown with automatically extracted acronyms and “subject terms”, and exploring things like identifying the Python packages imported into each notebook. Previous experiments include grabbing text headings out of notebooks, which may be useful when generating summary reports over sets of notebooks for review purposes.

Something I haven’t yet done is explore ways in which metrics evolve over time, for example as materials are polished and revised during a production or editorial process.

Reaction internally to my shared early doodlings has so far been pretty much universally negative, although varied: folk may be happy with their own simple metrics (reading rates applied to word counts), or totally in denial about the utility of any form of static analysis, depending on the intended study / material use model. As with many analytics, there are concerns that measures are okay if authors can use them as a tool to support their own work, but may not be appropriate for other people to make judgements from or about them. (This is worth bearing in mind when we talk about using metrics to monitor students, or computational tools to automatically grade them, but then shy away from applying similar techniques to our own outputs…)

You can find the code as it currently exists, created as a stream of consciousness notebook, in this gist. Comments and heckles welcome. As with any dataset, the data I’m producing is generated: a) because I can generate it; b) as just another opinion…

PS once I’ve gone through the notebook a few more times, building up different reports, generating different reading-time and engagement measures, coming up with a commandline interface to make it easier for folk to run against their own notebooks, etc, I think I’ll try to do the same for OU-XML materials… I already have OU-XML to markdown converters, so running the notebook profiler over that material is easy enough, particularly if I use Jupytext to transform the md to notebooks. See also the PS to the notebook visualisation post for related thoughts on this.

PPS The demo notebooks in this repository look like they could be interesting for eg code analysis. And this interactive DAG visualisation tool might also be interesting when it comes to viewing generated graphs.

PPPS This could be an interesting approach for building up a set of tools for checking student code: writing your own static code analysis code checks.

Fragment: Visualising Jupyter Notebook Structure

Over the weekend, I spent some time dabbling with generating various metrics over Jupyter notebooks (more about that in a later post…). One of the things I started looking at were tools for visualising notebook structure.

In the first instance, I wanted a simple tool to show the relative size of notebooks, as well as the size and placement of markdown and code cells within them.

The following is an example of a view over a simple notebook; the blue denotes a markdown cell, the pink a code cell, and the grey separates the cells. (The colour of the separator is controllable, as well as its size, which can be 0.)

When visualising multiple notebooks, we can also display the path to the notebook:

The code can be found in this gist.

The size of the cells in the diagram is determined as follows:

  • for markdown cells, the number of “screen lines” taken up by the markdown when presented on a screen with a specified screen character width;
        import textwrap

        LINE_WIDTH = 160

        def _count_screen_lines(txt, width=LINE_WIDTH):
            """Count the number of screen lines that an overflowing text line takes up."""
            ll = txt.split('\n')
            _ll = []
            for l in ll:
                #Model screen flow: split a line if it is more than `width` characters long
                #Keep blank lines: textwrap.wrap() returns an empty list for an empty string
                _ll = _ll + (textwrap.wrap(l, width) or [''])
            n_screen_lines = len(_ll)
            return n_screen_lines

  • for code cells, the number of lines of code; (long lines are counted over multiple lines as per markdown lines)

In parsing a notebook, we consider each cell in turn, capturing its cell type and screen line length, returning a cell_map as a list of (cell_size, cell_type) tuples:

    import os
    import nbformat

    VIS_COLOUR_MAP  = {'markdown':'cornflowerblue','code':'pink'}

    def _nb_vis_parse_nb(fn):
        """Parse a notebook and generate the nb_vis cell map for it."""

        cell_map = []

        _fn, fn_ext = os.path.splitext(fn)
        if not fn_ext=='.ipynb' or not os.path.isfile(fn):
            return cell_map

        with open(fn, 'r') as f:
            nb = nbformat.reads(f.read(), as_version=4)

        for cell in nb.cells:
            cell_map.append((_count_screen_lines(cell['source']), VIS_COLOUR_MAP[cell['cell_type']]))

        return cell_map

The following function handles single files or directory paths and generates a cell map for each notebook as required:

    def _dir_walker(path, exclude = 'default'):
        """Profile all the notebooks in a specific directory and in any child directories."""

        if exclude == 'default':
            exclude_paths = ['.ipynb_checkpoints', '.git', '.ipynb', '__MACOSX']
        else:
            #If we set exclude, we need to pass it as a list
            exclude_paths = exclude

        nb_multidir_cell_map = {}
        for _path, dirs, files in os.walk(path):
            #Start walking...
            #If we're in a directory that is not excluded...
            if not set(exclude_paths).intersection(set(_path.split('/'))):
                #Profile that directory...
                for _f in files:
                    fn = os.path.join(_path, _f)
                    cell_map = _nb_vis_parse_nb(fn)
                    if cell_map:
                        nb_multidir_cell_map[fn] = cell_map

        return nb_multidir_cell_map

The following function is used to grab the notebook file(s) and generate the visualisation:

    def nb_vis_parse_nb(path, img_file='', linewidth = 5, w=20, **kwargs):
        """Parse one or more notebooks on a path."""
        if os.path.isdir(path):
            cell_map = _dir_walker(path)
        else:
            cell_map = _nb_vis_parse_nb(path)
        nb_vis(cell_map, img_file, linewidth, w, **kwargs)

So how is the visualisation generated?

A plotter function generates the plot from a cell_map:

    import matplotlib.pyplot as plt

    #Note: gap, gap_colour and linewidth are assumed to be available
    #from the enclosing scope when plotter() is called
    def plotter(cell_map, x, y, label='', header_gap = 0.2):
        """Plot visualisation of gross cell structure for a single notebook."""

        #Plot notebook path
        plt.text(y, x, label)
        x = x + header_gap

        for _cell_map in cell_map:

            #Add a coloured bar between cells
            if y > 0:
                if gap_colour:
                    plt.plot([y, y+gap], [x, x], gap_colour, linewidth=linewidth)

                y = y + gap

            _y = y + _cell_map[0] + 1 #Make tiny cells slightly bigger
            plt.plot([y, _y], [x, x], _cell_map[1], linewidth=linewidth)

            y = _y

The gap can be automatically calculated relative to the longest notebook we’re trying to visualise which sets the visualisation limits:

    import math

    def get_gap(cell_map):
        """Automatically set the gap value based on overall length"""
        def get_overall_length(cell_map):
            """Get overall line length of a notebook."""
            overall_len = 0
            for i, (l, t) in enumerate(cell_map):
                #i is number of cells if that's useful too?
                overall_len = overall_len + l
            return overall_len

        max_overall_len = 0
        #If we are generating a plot for multiple notebooks, get the largest overall length
        if isinstance(cell_map, dict):
            for k in cell_map:
                _overall_len = get_overall_length(cell_map[k])
                max_overall_len = _overall_len if _overall_len > max_overall_len else max_overall_len
        else:
            max_overall_len = get_overall_length(cell_map)

        #Set the gap at 1% of the overall length
        return math.ceil(max_overall_len * 0.01)

The nb_vis() function takes the cell_map, either as a single cell map for a single notebook, or as a dict of cell maps for multiple notebooks, keyed by the notebook path:

    def nb_vis(cell_map, img_file='', linewidth = 5, w=20, gap=None, gap_boost=1, gap_colour='lightgrey'):
        """Visualise notebook gross cell structure."""

        x = y = 0

        #If we have a single cell_map for a single notebook
        if isinstance(cell_map, list):
            gap = gap if gap is not None else get_gap(cell_map) * gap_boost
            fig, ax = plt.subplots(figsize=(w, 1))
            plotter(cell_map, x, y)
        #If we are plotting cell_maps for multiple notebooks
        elif isinstance(cell_map, dict):
            gap = gap if gap is not None else get_gap(cell_map) * gap_boost
            fig, ax = plt.subplots(figsize=(w, len(cell_map)))
            for k in cell_map:
                plotter(cell_map[k], x, y, k)
                x = x + 1

        if img_file:
            plt.savefig(img_file)
The function will render the plot in a Jupyter notebook, or can be called to save the visualisation to a file.

This was just done as a quick proof of concept, so comments welcome.

On the to do list is to create a simple CLI (command line interface) for it, as well as explore additional customisation support (eg allow the cell colours to be specified). I also need to account for other cell types. An optional legend explaining the colour map would also make sense.

On the longer to do list is a visualiser that supports within cell visualisation. For example, headers, paragraphs and code blocks in markdown cells; comment lines, empty lines, code lines, magic lines / blocks, shell command lines in code cells.
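A first pass at the code cell side of that might just classify each line by its leading sigil (a naive sketch; it doesn’t handle multi-line strings or the bodies of cell magics):

```python
def classify_line(line):
    """Crudely classify a single line from a notebook code cell."""
    stripped = line.strip()
    if not stripped:
        return 'blank'
    if stripped.startswith('%'):    #covers both % line magics and %% cell magics
        return 'magic'
    if stripped.startswith('!'):
        return 'shell'
    if stripped.startswith('#'):
        return 'comment'
    return 'code'

cell_src = "%matplotlib inline\n!pip install nbformat\n#setup\nimport nbformat\n"
print([classify_line(l) for l in cell_src.splitlines()])
# → ['magic', 'shell', 'comment', 'code']
```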

In OU notebooks, being able to identify areas associated with activities would also be useful.

Supporting the level of detail required in the visualisation may be tricky, particularly in long notebooks. A vertical, multi-column format is probably best, showing eg an approximate “screen’s worth” of content in a column, then the next “scroll” down displayed in the next column along.

Something else I can imagine is a simple service that would let you pass a link to an online notebook and get a visualisation back, or a link to a Github repo that would give you a visualisation back of all the notebooks in the repo. This would let you embed a link to the visualisation, for example, in the repo README. On the server side, I guess this means something that could clone a repo, generate the visualisation and return the image. To keep the workload down, the service would presumably keep a hash of the repo and the notebooks within the repo, and if any of those had changed, regenerate the image, else re-use a cached one. (It might also make sense to cache images at a notebook level to save having to reparse all the notebooks in a repo where only a single notebook has changed, and then merge those into a single output image?)
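The cache check itself could be as simple as hashing each notebook’s content and only regenerating when the digest changes; a minimal sketch (with a dict standing in for a real cache store, and `render` standing in for the visualiser):

```python
import hashlib

def nb_digest(nb_bytes):
    """Content hash of a notebook file's bytes."""
    return hashlib.sha256(nb_bytes).hexdigest()

_cache = {}  #digest -> rendered image; stand-in for a persistent cache

def get_or_render(nb_bytes, render):
    """Return a cached rendering, regenerating only if the content changed."""
    key = nb_digest(nb_bytes)
    if key not in _cache:
        _cache[key] = render(nb_bytes)
    return _cache[key]
```

Hashing per notebook rather than per repo gives the notebook-level caching mentioned above for free.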

PS this has also got me thinking about simple visualisers over XML materials too… I do have an OU-XML to ipynb route (as well as OU-XML2md2html, for example), but a lot of the meaningful structure from the OU-XML would get lost on a trivial treatment (eg activity specifications, multimedia use, etc). I wonder if it’d make more sense to create an XSLT to generate a summary XML document and then visualise from that? Or create Jupytext md with lots of tags (eg tagging markdown cells as activities etc) that could be easily parsed out in a report? Hmmm… now that may make a lot more sense…

Fragment: Code Complexity in Notebooks — I’m Obviously Not Wily Enough

Following on from Thinking About Things That Might Be Autogradeable or Useful for Automated Marking Support, via Chris Holdgraf I get something else that might be worth considering both for profiling notebooks as well as assessing code.

The response came following an idle tweet I’d posted wondering “If folk can read 600wpm (so 10wps), what’s a reasonable estimate for reading/understanding code blocks eg in jupyter notebook?”; if you’re trying to make sense of a code chunk in a notebook, I’m minded to assume that the number of lines may have an effect, as well as the line length.

Context for this: I’ve started mulling over a simple tool to profile / audit our course notebooks to try to get a baseline for how long it might reasonably take for a student to work through them. We could instrument the notebooks (eg using the nbgoogleanalytics or jupyter-analytics extensions to inject Google Analytics tracking codes into notebooks) and collect data on how long it actually takes, but we don’t. And whilst our course compute environment is on my watch, we won’t (at least, not using a commercial analytics company, even if their service is “free”, even though it would be really interesting…). If we were to explore logging, it might be interesting to add an open source analytics engine like Matomo (Piwik, as was) to the VM and let students log their own activity… Or maybe explore jupyter/telemetry collection with a local log analyser that students could look at…

So, Chris’ suggestion pointed me towards wily, “an application for tracking, reporting on timing and complexity in Python code”. Out of the can, wily can be used to analyse and report on the code complexity of a git repo over a period of time. It also looks like it can cope with notebooks: “Wily will detect and scan all Python code in .ipynb files automatically”. It also seems like there’s the ability to “disable reporting on individual cells”, so maybe I can get reports on a per notebook or per cell basis?

My requirement is much simpler than the evolution of the code complexity over time, however: I just want to run the code complexity tools over a single set of files, at one point in time, and generate reports on that. (Thinks: letting students plot the complexity of their code over time might be interesting, eg in a mini-project setting?) However, from the briefest of skims of the wily docs, I can’t fathom out how to do that (there is support for analysing across the current filesystem rather than a git repo, but that doesn’t seem to do anything for me… Is it looking to build a cache and search for diffs? I DON’T WANT A DIFF! ;-)

There is an associated blog post that builds up the rationale for wily here — Refactoring Python Applications for Simplicity — so maybe by reading through that and perhaps poking through the wily repo I will be able to find an easy way of using wily, somehow, to profile my notebooks…

But the coffee break I gave myself to look at this and give it a spin has run out, so it’s consigned back to the back of the queue I’ve started for this side-project…

PS From a skim of the associated blog post, wily‘s not the tool I need: radon is, “a Python tool which computes various code metrics, including raw metrics (SLOC (source lines of code), comment lines, blank lines, etc.), Cyclomatic Complexity (i.e. McCabe’s Complexity), Halstead metrics (all of them), the Maintainability Index (a Visual Studio metric)”. So I’ll be bumping that to the head of the queue…
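radon does this properly; but to give a feel for what a McCabe-style measure is counting, here’s a naive stdlib approximation that just tallies branch points in the AST (this is emphatically not radon’s actual algorithm):

```python
import ast

#Node types that introduce an extra execution path (roughly)
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)

def crude_complexity(src):
    """Very rough cyclomatic-complexity-like score: 1 + number of branch points."""
    tree = ast.parse(src)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))

print(crude_complexity("x = 1"))  # → 1
print(crude_complexity("for i in range(3):\n    if i:\n        x = i"))  # → 3
```

Run over each code cell in turn, this would give a per-cell complexity profile for a notebook.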

Thinking About Things That Might Be Autogradeable or Useful for Automated Marking Support

One of the ideas we keep floating but never progressing is how we might make use of nbgrader. My feeling is we could start to make use of it now on an optional, individual tutor-marker basis. The current workflow is such that students submit assessments centrally and work is then sent to assigned markers; markers mark the work and then return it centrally, whence it is dispatched back to students.

Whilst there has been a recent procurement exercise looking at replacing the central assignment handling system, I doubt that nbgrader even featured as a side note; although it can be used to release work to students, collect it from them, manage its allocation to markers, etc, I suspect the chance is vanishingly small of the institution tolerating more than one assignment handling system, and I very much doubt that nbgrader would be that system.

Despite that, individual working is still a possibility and it requires the smallest of tweaks. Our data course currently distributes continuous assignments as Jupyter notebooks, and students have been encouraged to return their work as completed notebooks, although they may also return notebooks converted to Word docs, for example. So if we just marked up the notebook with each test cell marked as a manually graded assignment, or manually graded task, markers could individually decide to use the nbgrader tools to support their marking and feedback.

(We could also use the nbgrader system to generated the released-to-student notebooks and make sure we have stripped the answers out of them…Erm…)

When it comes to automated grading, lots of the questions we ask are not ideally suited to autograding, although with a few tweaks we could make them testable.

The nbgrader docs provides some good advice on writing good test cases, including examples of using mocking to help test whether functions were called or not called, as well as grading charts / plots using plotchecker.

As someone who doesn’t write tests, I started to explore for myself examples of things we can test for autograding and auto-feedback. Note the auto-feedback reference there: one of the things that started to interest me was not the extent to which we could use automated tests to generate a mark per se, but how we could use tests to provide more general and informative forms of feedback.

True, a score is a form of feedback, but quite a blunt one, and may suffer from false positives or, more likely, false negatives. So could we instead explore how tests can be used to provide more constructive feedback; cf the use of linters in this respect (for example, Nudging Student Coders into Conforming with the PEP8 Python Style Guide Using Jupyter Notebooks, flake8 and pycodestyle_magic Linters). And rather than using autograders as a be-all and end-all, could we use them as feedback generators and as a support tool for markers, making mark suggestions rather than official scores?

Once you start thinking about an autograder as a marker support tool, rather than a marker in its own right, it reduces the need for the autograder to be right… that can be left to the judgement of the human marker. All that we would require is that it is mostly useful/helpful, or at least, more helpful/useful than it is a hindrance.

Here’s another example of how we might generate useful feedback, this time as part of a grader that is capable of assigning partial credit: generating partial credit.

As an example, I wrote up some notes on the crudest of marking support tools for marking free text answers against a specimen answer. I know very little about NLP (natural language processing) and even less about automated marking of free text answers, but I think I can see some utility even with a crappy similarity matcher from an off-the-shelf NLP package (spacy).
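For comparison, even a stdlib bag-of-words overlap gives a (very) crude similarity score to put alongside spacy’s vector-based similarity; a sketch:

```python
import re

def jaccard_similarity(answer, specimen):
    """Crude token-overlap score between a student answer and a specimen answer."""
    tokens = lambda t: set(re.findall(r"[a-z']+", t.lower()))
    a, s = tokens(answer), tokens(specimen)
    if not a and not s:
        return 0.0
    return len(a & s) / len(a | s)

print(round(jaccard_similarity("the model overfits the data",
                               "the model overfits training data"), 2))
# → 0.8
```

A marker support tool would surface the score as a hint, not a mark, for exactly the reasons discussed above.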

PS in passing, I also noticed this tip for nbgrader autograding in a Docker container using envkernel, a tool that can wrap a docker container so you can launch it as a notebook kernel. (I haven’t managed to get this working yet; I didn’t spot a demo that “just works”, so I figure I need to actually read the docs, which I haven’t made time to do yet… So if you do have a baby steps example that does work, please share it via the comments… Or submit it as a PR to the official docs…)

Accessing MyBinder Kernels Remotely from IPython Magic and from VS Code

One of the issues facing us as a distance learning organisation is how to support the computing needs of distance learning students, on a cross-platform basis, and over a wide range of computer specifications.

The approach we have taken for our TM351 Data Management and Analysis course is to ship students a Virtualbox virtual machine. This mostly works. But in some cases it doesn’t. So in the absence of an institutional online hosted notebook, I started wondering about whether we could freeload on MyBinder as a way of helping students run the course software.

I’ve started working on an image here, though it’s still divergent from the shipped VM (I need to sort out things like database seeding, and maybe fix some of the package versions…), but that leaves open the question of how students would then access the environment.

One solution would be to let students work on MyBinder directly, but this raises the question of how to get the course notebooks into the Binder environment (the notebooks are in a repo, but it’s a private repo) and out again at the end of a session. One solution might be to use a Jupyter github extension, but this would require students setting up a Github repository, installing and configuring the extension, remembering to sync (unless auto-save-and-commit is available, or could be added to the extension) and so on…

An alternative solution would be to find a way of treating MyBinder like an Enterprise Gateway server, launching a kernel via MyBinder from a local notebook server extension. But I don’t know how to do that.

Some fragments I have had laying around for a bit were the first fumblings towards a Python MyBinder client API, based on the Sage Cell client for running a chunk of code on a remote server… So I wondered whether I could do another pass over that code to create some IPython magic that lets you create a MyBinder environment from a repo and then execute code against it from a magicked code cell. Proof of concept code for that is here: innovationOUtside/ipython_binder_magic.

One problem is that the connection seems to time out quite quickly. The code is really hacky and could probably be rebuilt from functions in the Jupyter client package, but making sense of that code is beyond my limited cut-and-paste abilities. But: it does offer a minimal working demo of what such a thing could be like. At a push, a student could install a minimal Jupyter server on their machine, install the magic, and then write notebooks using magic to run the code against a Binder kernel, albeit one that keeps dying. Whilst this would be inconvenient, it’s not a complete catastrophe because the notebook would still be saved to the student’s local machine.

Another alternative struck me today when I saw that Yuvi Panda had posted to the official Jupyter blog a recipe on how to connect to a remote Jupyterhub from Visual Studio Code. The mechanics are quite simple — I posted a demo here about how to connect from VS Code to a remote Jupyter server running on Digital Ocean, and the same approach works for connecting to our VM notebook server, if you tweak the VM notebook server’s access permissions — but it requires you to have a token. Yuvi’s post says how to find that from a remote JupyterHub server, but can we find the token for a MyBinder server?

If you open your browser’s developer tools and watch the network traffic as you launch a MyBinder server, then you can indeed see the URL used to launch the environment, along with the necessary token:

But that’s a bit of a faff if we want students to launch a Binder environment, watch the network traffic, grab the token and then use that to create a connection to the Binder environment from VS Code.

Searching the contents of pages from a running Binder environment, it seems that the token is hidden in the page:

And it’s not that hard to find… it’s in the link from the Jupyter log. The URL needs a tiny bit of editing (cut the /tree path element) but then the URL is good to go as the kernel connection URL in VS Code:
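That tidying step is mechanical enough to script; a sketch using a made-up example URL:

```python
from urllib.parse import urlparse, parse_qs

def kernel_connection_url(launch_url):
    """Strip the /tree path element and extract the token from a Jupyter URL."""
    parts = urlparse(launch_url)
    path = parts.path.replace('/tree', '').rstrip('/')
    token = parse_qs(parts.query).get('token', [''])[0]
    return f"{parts.scheme}://{parts.netloc}{path}/?token={token}"

#Hypothetical MyBinder launch URL for illustration
print(kernel_connection_url(
    "https://hub.gke.mybinder.org/user/example-abc123/tree?token=secret123"))
# → https://hub.gke.mybinder.org/user/example-abc123/?token=secret123
```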

Then you can start working on your notebook in VS Code (open a new notebook from the settings menu), executing the code against the MyBinder environment.

You can also see the notebooks listed in the remote MyBinder environment.

So that’s another way… and now it’s got me thinking… how hard would it be to write a VS Code extension to launch a MyBinder container and then connect to it?

PS by the by, I notice that developer tools in Firefox became increasingly useful with the Firefox 71 release in the form of a websocket inspector.

This lets you inspect traffic sent across a websocket connection. For example, if we force a page reload on a running Jupyter notebook, we can see a websocket connection:

We can then click on that connection and monitor the messages being passed over it…

I thought this might help me debug / improve my Binder magic, but it hasn’t. The notebook looks like it sends an empty ping as a heartbeat (as per the docs), but if I try to send an empty message from the magic it closes the connection? Instead, I send a message to the heartbeat channel…

PS sort of related, binderbot, “A simple CLI to interact with binder, eg run local notebooks on a remote binder”.

vagrant share – sharing a vagrant launched headless VM service on the public interwebz

Lest I forget (which I had…):

vagrant share lets you launch a VM using vagrant and share the environment using ngrok in three ways:

  • via public URLs (expose your http ports to the web, rather than locally);
  • via ssh;
  • via vagrant connect (connect to any exposed VM port from a remote location).

So this could be handy for remote support with students… If we tell them to install the vagrant share plugin, then we can offer remote support…