Yeah, it’s what they do: generate convincing text. Calling it “errors” makes as much sense as claiming my dice “produced errors” when I lost at Yahtzee.
An illustrative example: https://kucharski.substack.com/p/real-signals-or-artificial-stereotypes
"First, I’d created 2000 free-text responses and labelled them ‘UK’. Then I copied and pasted the exact same 2000 responses but labelled these ‘US’. Finally, I combined them to create a dataset of 4000 total responses, and jumbled them up.
Despite the responses being identical for the UK and US, Copilot produced a rich, detailed summary of how US and UK respondents differed."
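For anyone who wants to try the same kind of control themselves, here is a rough sketch of how such a dataset could be put together. This is not the author's actual code: the placeholder responses, the function names and the seed are all made up for illustration, and the step of handing the rows to a chatbot is left out.

```python
import random
from collections import Counter


def build_control_dataset(responses, seed=0):
    """Label every response once as 'UK' and once as 'US', then shuffle.

    Hypothetical helper: the two groups are identical by construction,
    so any "difference" a tool reports between them is invented.
    """
    rows = [("UK", text) for text in responses] + [("US", text) for text in responses]
    rng = random.Random(seed)  # fixed seed so the shuffle is repeatable
    rng.shuffle(rows)
    return rows


def label_distributions(rows):
    """Count how often each response text appears under each label."""
    by_label = {}
    for label, text in rows:
        by_label.setdefault(label, Counter())[text] += 1
    return by_label


# Placeholder stand-ins for the 2000 free-text responses (assumption).
placeholder_responses = [f"response {i}" for i in range(2000)]

rows = build_control_dataset(placeholder_responses)
dists = label_distributions(rows)

print(len(rows))                   # 4000
print(dists["UK"] == dists["US"])  # True: the groups are identical
```

The point of the last check is that the only honest summary of "how UK and US respondents differ" here is "they don't".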
LLMs making “errors” around elections might be the exact point of them.
Researchers tested how the services ChatGPT, Google Gemini, Google AI Overviews, Grok and Replika performed on a single day during the 2026 Scottish pre-election window and found:
- One third (34.1%) of responses across chatbots contained factual errors, whilst reliability varied significantly across services
- Errors included getting the date of election day wrong, giving wrong information about the need for voters to bring ID, “hallucinating” a candidate, and making up an expenses scandal on one occasion, and a nepotism scandal on another.
We reveal new evidence of the scale of these services’ unreliability during elections and make recommendations for the government to close the regulatory gap.
The last bit is the most important imo: chatbots must not be allowed to present themselves as providers of information, nor should any commercial or official body be allowed to rely on them.




