Noting that Bing may soon integrate ChatGPT (Microsoft aims for AI-powered version of Bing – The Information [Reuters]), we can only hope they sort out how URLs are parsed…
It got the PM wrong, but perhaps that’s because its training biases it to Johnson?
My querying is really sloppy here, and doesn’t really check whether ChatGPT is getting content from the page or not… Which in part goes to show how beguiling all this stuff can be, and how easy it is to make so many assumptions as the apparent fit of the responses to the prompts takes you along with it (as you’d expect: the model chucks out the next token based on what it’s likely to be, given all the historical training sentences that have been used to build the model).
Okay, so maybe it isn’t reading the page, it’s just parsing the URL and using the words from the page slug to prompt the faux summary? [That said, as Phil Bradley pointed out in a comment, the name of the PM isn’t actually mentioned in the linked-to post. Also, as @arosha pointed out, the maths thing has been trailed in at least one news report from August 2022, although that is past the model cut-off point.] Let’s try it with a made-up URL:
Okay, so it seems to claim that it doesn’t recognise that URL. David Kane tried something less contentious, and did get a response based around a different made-up URL:
So maybe the “plausibility” of the URL is relevant?
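As a crude illustration of the slug-parsing hypothesis: everything a model would need to confect a plausible-sounding summary can often be recovered from the URL path alone. Here’s a minimal sketch in Python (the example URL is made up, and this is just my guess at the sort of keyword extraction that might be going on, not anything OpenAI has confirmed):

```python
# Sketch: extract candidate topic words from a URL's path slug.
# The URL below is entirely made up for illustration.
import re
from urllib.parse import urlparse

def slug_keywords(url: str) -> list[str]:
    """Pull plausible topic words out of the final path segment of a URL."""
    path = urlparse(url).path
    slug = path.rstrip("/").split("/")[-1]
    # Drop any file extension, then split on common slug separators.
    slug = slug.rsplit(".", 1)[0]
    return [w for w in re.split(r"[-_]+", slug) if w and not w.isdigit()]

slug_keywords("https://example.com/2023/01/uk-pm-maths-education-announcement/")
# → ['uk', 'pm', 'maths', 'education', 'announcement']
```

If those words are enough to prompt a generic riff on the topic, the “summary” can look eerily like it was read from the page, even though the page was never fetched.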
With a bit of fiddling, we can get a response where ChatGPT implies it can’t access the web:
If we are referring to URLs in Bing prompts, and the search engine is coming up with responses based on page indexes, whereas the ChatGPT component is hallucinating indexes based on the prompt and the terms in the URL, then, erm…, WTF? (For a quick take on current search engine + GPT3 integrations, see Combining GPT3 and Web Search — Perplexity.ai and lexi.ai.)
Elsewhere in the blogoverse, I notice that D’Arcy has also been playing with ChatGPT — ChatGPT Designs a University Learning Space Design Evaluation Plan — and spotted that ChatGPT is happy to make up plausible sounding but non-existent citations to strengthen the form of its response.
I’ve noticed that when trying to get ChatGPT to make up references (eg Information Literacy and Generating Fake Citations and Abstracts With ChatGPT), it often uses actual (and relevant) journal titles, the names of actual authors (and author combinations) and plausible titles. So I wonder: if ChatGPT makes up a citation claiming me as the author in some sort of plausible context, and an author then includes it in a published, peer-reviewed work from a commercial publisher, and I get taken through an academic disciplinary committee because some sort of citation harvesting engine has picked up the fake citation, and that citation harvester output somehow finds its way back into a reputation management system my institution is using, and I am “rumbled” for making up fake citations, who do I sue?
I’ve noticed that ChatGPT does have post-processor filters that can flag content warnings, so should it also be providing an optional “fake citation” filter to highlight fake citations? There could also be value in identifying real authors and the “sort of” paper title they might publish, or the sort of journal they are likely to publish in, even if the actual paper (or even the actual journal) doesn’t exist. Do citation managers such as Zotero provide existence check tools so users can check that a citation actually exists, rather than just ensuring stylistic correctness for a particular citation format?
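An existence check of that sort needn’t be complicated plumbing, at least for DOI-bearing citations. As a sketch (not a feature of Zotero or any other citation manager that I know of, and the function names are my own invention): do a cheap syntactic check on the DOI, then ask the Crossref REST API whether a record for it exists:

```python
# Sketch: does a cited DOI actually exist? Syntactic check first,
# then (optionally) a lookup against the Crossref REST API.
import re
import urllib.error
import urllib.parse
import urllib.request

DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def looks_like_doi(doi: str) -> bool:
    """Cheap offline sanity check before hitting the network."""
    return bool(DOI_PATTERN.match(doi))

def doi_exists(doi: str, timeout: float = 10.0) -> bool:
    """Return True if Crossref has a record for this DOI (network call)."""
    if not looks_like_doi(doi):
        return False
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi, safe="")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Crossref returns 404 for unknown DOIs.
        return False
```

That only covers works with DOIs, of course; hallucinated citations to books, reports or older papers would need fuzzier matching against bibliographic databases.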
If Bing is to incorporate ChatGPT, and generate novel texts as well as returning links to third party texts, how will it filter out generated responses that are essentially bullshit? Particularly if it is rating or ranking the generated response (which is generated from indexed content) against the content pages that contributed to the underlying model?
And finally, there has been a reasonable amount of traffic on the wires with folk asking about what the effect on education and assessment is likely to be. Whilst “everyone” has been talking about ChatGPT, I suspect most people haven’t, and even fewer have signed up to play with it. If ChatGPT gets incorporated into Bing (or Google incorporates its own LLM into Google search), then the content will be just another content option for students pasting questions into the search box to copy and paste from. More “deliberate” use might result from incorporation into MS Word, eg as a Grammarly service [hmm, I wonder what percentage of OUr students use Grammarly, and whether we can detect its use?].
PS Thinks: just like Amazon spots popular products from its search and sales logs and then releases undercutting or competitively priced and highly ranked own-brand alternatives, is it hard to imagine a search engine that uses something like Common Crawl for a base level of web search, but also mints URLs and auto-generates content pages on-the-fly in response to queries that it (legitimately) ranks highly and pops a few ads onto, to give the appearance that the result is on a “legitimate” web page?
PPS Time to read Richard Gregory’s Mind In Science again, I think, and wonder what he would have thought about LLMs…