Idle Thoughts — ChatGPT etc. in Education

There’s been a lot of not-much-substance written recently about the threat posed to education (and, in particular, assessment) by ChatGPT and its ilk. The typical starting point is that ChatGPT can (rather than could) be used as a universal cheating engine (CheatGPT?), and that therefore everyone will use it to cheat.

In the course I work on, the following statement defines the assessment contract we have with students:

PLAGIARISM WARNING – the use of assessment help services and websites

The work that you submit for any assessment/exam on any module should be your own. Submitting work produced by or with another person, or a web service or an automated system, as if it is your own is cheating. It is strictly forbidden by the University. 

You should not

  • provide any assessment question to a website, online service, social media platform or any individual or organisation, as this is an infringement of copyright. 
  • request answers or solutions to an assessment question on any website, via an online service or social media platform, or from any individual or organisation. 
  • use an automated system (other than one prescribed by the module) to obtain answers or solutions to an assessment question and submit the output as your own work. 
  • discuss exam questions with any other person, including your tutor.

The University actively monitors websites, online services and social media platforms for answers and solutions to assessment questions, and for assessment questions posted by students. Work submitted by students for assessment is also monitored for plagiarism. 

A student who is found to have posted a question or answer to a website, online service or social media platform and/or to have used any resulting, or otherwise obtained, output as if it is their own work has committed a disciplinary offence under Section SD 1.2 of our Code of Practice for Student Discipline. This means the academic reputation and integrity of the University has been undermined.  

The Open University’s Plagiarism policy defines plagiarism in part as: 

  • using text obtained from assignment writing sites, organisations or private individuals. 
  • obtaining work from other sources and submitting it as your own. 

If it is found that you have used the services of a website, online service or social media platform, or that you have otherwise obtained the work you submit from another person, this is considered serious academic misconduct and you will be referred to the Central Disciplinary Committee for investigation.    

It is not uncommon in various assessment questions to see components that either implicitly or explicitly instruct students to use a web search engine to research a particular question. Ordinarily, this might count as “using an automated system”, although its use is then legitimised by virtue of being “prescribed by the module”.

I haven’t checked to see whether the use of spell checkers, grammar checkers, code linters, code stylers, code error checkers, etc. is also whitelisted somewhere. For convenience, let’s call these Type 1 tools.

In the past, I’ve wondered about deploying various code related Type 1 tools (for example, Nudging Student Coders into Conforming with the PEP8 Python Style Guide Using Jupyter Notebooks, flake8 and pycodestyle_magic Linters).
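
By way of a minimal sketch (assuming the flake8 and pycodestyle_magic packages from that post are installed), switching linting on in a notebook looks something like this:

    # In a Jupyter notebook code cell, load the linting magic
    # (assumes: pip install flake8 pycodestyle_magic)
    %load_ext pycodestyle_magic

    # Switch on automatic flake8 checks for every subsequent code cell;
    # style breaches are then reported as warnings beneath each cell
    %flake8_on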

In the same way that we can specify “presentation marks” in a marking guide that can be dropped to penalise misspelled answers, code linters can be used to warn about code style that deviates from a particular style guide convention. In addition, code formatter tools can automatically correct deviations from style guides (the behaviour of the code isn’t changed, just how it looks).
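
As a trivial illustration of the distinction (my own example, not taken from any particular tool’s documentation), a linter would warn about the missing whitespace in the first function below, while a formatter would simply rewrite it as the second:

    # Style-breaching code (flagged by a linter: E231, missing whitespace
    # after a comma; E225, missing whitespace around an operator)
    def add(a,b):
        total=a+b
        return total

    # The same code after an automatic formatter has reflowed it;
    # the behaviour is identical, only the presentation has changed
    def add(a, b):
        total = a + b
        return total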

If we install and enable these “automatic code style correction” tools, we can ensure that students submit style-conforming code, but at the risk that they don’t actually learn the rules or how to apply them; they just let the machine do it. Does that matter? Alternatively, we can configure the tools to display warnings about code style breaches and prompt the student to correct the code manually. Or we can provide a button that automatically corrects the code style, but that the student has to invoke manually (ideally after having read the warning, the intention being that at some point they start to write well-styled code without the frictional overhead of seeing warnings and then automatically fixing the code).

There are other tools available in coding contexts that can be used to check code quality. For example, checking that all required packages are imported, that no packages are loaded that aren’t used, that no functions are defined but not called, and that no variables are assigned but not otherwise referred to. (We could argue these are “just” breaches of a style rule that says “if something isn’t otherwise referred to, it shouldn’t be declared”.) Other rules might explicitly suggest code rewrites to improve code quality (for example, adamchainz/flake8-comprehensions or MartinThoma/flake8-simplify). Students could easily use such tools to improve their code, so should we be encouraging them to do so? (Many code projects describe themselves as “opinionated”, which is to say the maintainers have an opinion about how the code should be written, and often include tools to check that code conforms to the corresponding rules, as well as making autocorrected code suggestions where rules are breached.) When teaching coding, to what extent could, or should, we: a) take an opinionated view on “good code”; b) teach to that standard; c) provide tools that flag contraventions of that standard as warnings; d) offer autocorrect suggestions? Which is to say, how much should we automate the student’s answering of a code question?
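
For example (a made-up fragment, with the warning codes that flake8 and flake8-comprehensions would typically report shown as comments):

    import os      # F401: 'os' imported but unused
    import json

    data = json.loads('{"a": 1, "b": 2}')

    unused = 42    # F841, if assigned inside a function and never used

    # C400: unnecessary generator passed to list();
    # flake8-comprehensions suggests the rewrite shown below
    values = list(x for x in data.values())
    # suggested rewrite: values = [x for x in data.values()]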

In many coding environments, tools are available that offer “tab autocompletion”:
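
For instance (a hypothetical IPython/Jupyter session; the completion list varies with context):

    import pandas as pd

    df = pd.DataFrame({"score": [1, 2, 3]})

    # Typing a partial attribute name and pressing Tab pops up
    # matching completions, e.g.:
    #     df.so<TAB>  ->  df.sort_index, df.sort_values
    df.sort_values("score")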

Is that cheating?

If we don’t know what a function is, or what arguments to call it with, there may be easy ways of prompting for the documentation:
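
In IPython or a Jupyter notebook, for example, a trailing question mark pulls up the docstring, and plain Python offers the built-in help() function:

    # IPython / Jupyter: show the documentation for a function
    sorted?

    # Plain Python equivalent
    help(sorted)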

Is that cheating?

Using tools such as GitHub Copilot (a code equivalent of ChatGPT), we can get autosuggestions for code that will perform a prompted-for task. For example, in VS Code with the GitHub Copilot extension enabled (GitHub Copilot is a commercial service, but you can get free access as an educator or as a student), write a comment line that describes the task you want to perform, then on the next line hit Ctrl-Return to get suggested code completions.
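
The sort of exchange this produces looks something like the following (the suggested completion here is illustrative; Copilot’s actual suggestions vary from run to run):

    # Prompt, written as a comment:
    # read a CSV file and return the mean of the "score" column

    # A suggested completion of the sort Copilot might offer:
    import pandas as pd

    def mean_score(filename):
        df = pd.read_csv(filename)
        return df["score"].mean()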

Is that cheating?

Using a tool such as GitHub Copilot, you get the feeling that there are three ways it can be used: as a simple autosuggestion tool (suggesting the methods available on an object, for example); as a “rich autocompletion” tool, where it suggests a chunk of code that you can accept or not; or as a generator of multiple options, which you may choose to accept or not, or use as inspiration.

In the case of using Copilot to generate suggestions, the user must evaluate the provided options and then select (or not) one that is appropriate. This is not so different to searching on a site such as Stack Overflow and skimming answers until you see something that looks like the code you need to perform a particular task.

Is using Stack Overflow cheating, compared to simply referring to code documentation?

In its most basic form, code documentation simply describes available functions, and how to call them (example):
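
Something like the following, for instance (paraphrasing the Python standard library reference for the built-in round() function):

    help(round)
    # round(number, ndigits=None)
    #     Round a number to a given precision in decimal digits.
    #     The return value is an integer if ndigits is omitted or None.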

But increasingly, code documentation sites also include vignettes or examples of how to perform particular tasks (example).
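
The difference is the shift from “what this function does” to “how to do this task”; a vignette-style entry might read more like this (an illustrative recipe, not taken from any particular documentation site):

    # Task: read a CSV file and total the values in a numeric column
    import csv

    with open("scores.csv", newline="") as f:
        total = sum(float(row["score"]) for row in csv.DictReader(f))

    print(total)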

So is Stack Overflow anything other than a form of extended community documentation, albeit unofficial? (Is it unofficial if a package maintainer answers the Stack Overflow question, and then considers adding that example to the official documentation?)

Is GitHub Copilot doing anything more than offering suggested responses to a “search prompt”, albeit generated responses?

Via @downes, I note the post Ten Facts About ChatGPT, which suggests:

“To avoid ChatGPT being used for cheating, there are a few different steps that educators and institutions could take. For example, they could:

  • Educate students and educators on the ethical use of AI technology, including ChatGPT, in academic settings.
  • Develop guidelines and policies for the use of ChatGPT in academic work, and make sure that students and educators are aware of and follow these guidelines.
  • Monitor the use of ChatGPT in academic settings and take appropriate action if it is used for cheating or other unethical purposes.
  • Use ChatGPT in ways that support learning and academic achievement rather than as a replacement for traditional forms of assessment. For example, ChatGPT could provide personalized feedback and support to students rather than as a tool for generating entire papers or exams.
  • Incorporate critical thinking and ethical reasoning into the curriculum, to help students develop the skills and habits necessary to use ChatGPT and other AI technology responsibly.”

In my own use of ChatGPT, I have found it is possible to get ChatGPT to “just” generate responses to a question that may or may not be useful, just as it’s possible to get results from a web search engine from a search query or search prompt. But in the same way that you can often improve the “quality” of web search results by running a query, seeing the answers, refining or tuning your own understanding of what you are looking for based on those results, updating the query, and evaluating the new results, so too can you improve the quality of responses from ChatGPT (or GitHub Copilot) by iterating on a prompt, or building on it in conversational style.

Just as it might be acceptable for a student to search the web, as well as their course materials, set texts, or other academic texts they have independently found or that have been recommended to them by tutors or subject librarians, to support them in performing an assessment activity, but not acceptable to just copy and paste the result into an assessment document, so it would also seem reasonable to let students interact with tools such as ChatGPT to help them come to an understanding of a topic, or of how to solve a particular problem, but in a “don’t just tell me the answer” sort of a way. The reward for the student should not be the extrinsic reward of a mark, but the intrinsic reward of having answered the question themself: “I did that.” Albeit with a lot of tutor (or ChatGPT) help.

The issue is not so much around “[e]ducating students and educators on the ethical use of AI technology, including ChatGPT, in academic settings”, it’s around educating students on what the point of assessment is: to provide some sort of levelling feedback on how well they understand a particular topic or can perform a particular task based on their performance on a contrived activity and according to a particular evaluation scheme.

If we tweaked our assessment model to an open, auction-based variant that combines peer assessment and self-assessment, in which students submit their scripts for marking, then choose their own anticipated grade, then get to see everyone else’s script and anticipated grade, then get to revise their own grade, and then get randomly vivaed at a rate dependent on rank (the highest ranking are more likely to be vivaed), would we be any worse off? Would we even need the initial step where a marker marks the scripts? Could we replace the viva with actual marking of a script? And let the Hunger Games begin as students crib between themselves how to score their work?

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
