AI struggles with less common data: Inconsistent results for Valletta Bastions (actual mean height: 25m) highlight issues with insufficient training data. We also touch on AI poisoning.
https://www.alanbonnici.com/2025/03/ai-got-it-wrong-missing-information-or.html
@timnitGebru @emilymbender
Yup, today I did a search for something, and an #LLM summary / result combined an article I wrote with another, completely unrelated article, and provided a bullshit response that appeared factual.
Admittedly, it did give my site credit alongside the other article, but NOW, the problem is that it’s smearing my reputation as my article is correct, but the summary is wrong — implying that I’m the one that fucked up.
AI Got It Wrong - Prime Numbers
See how 6 AI engines respond to the question "Is 3821 a prime number?" followed by "What is the next prime number greater than 3821?" Compare speed, accuracy and each one's computational reasoning.
https://www.alanbonnici.com/2025/03/ai-got-it-wrong-prime-numbers.html
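For reference, both questions can be checked deterministically. A minimal sketch using plain trial division; the function names `is_prime` and `next_prime` are my own for illustration and do not come from the linked article:

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division: test odd divisors up to the integer square root of n."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def next_prime(n: int) -> int:
    """Return the smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


print(is_prime(3821))    # True
print(next_prime(3821))  # 3823
```

Trial division is slow for large numbers but more than adequate at this size, and unlike a generated answer it gives the same result on every run.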
It’s all hallucinations
The discourse on “AI” systems, chat bots, “assistants” and “research helpers” is defined by a lot of future promises. Those systems are dysfunctional or at least not working great right now, but there’s the promise of things getting better in the future.
Which is how we often perceive tech to work: Early versions might be a bit wonky, but there’s constant iteration and work going on to improve systems to be more capable, more robust and maybe even cheaper at some point.
The most pressing problem for many modern “AI” systems, especially the generative systems that are all the rage these days, is so-called “hallucinations”, a term describing when an AI system generates incorrect information. Think of a research agent inventing a paper to quote from that doesn’t exist, for example. (Google’s AI assistant telling you to put glue on pizza is not a hallucination in that sense, because that is just regurgitating information from Reddit that every toddler would recognize as a joke.) Hallucinations are the big issue that many researchers are trying to address – with mixed results. Methods like RAG are shifting the probabilities a bit but are still not solving the problem: Hallucinations keep happening.
But I think that this discourse misses an important thing: Anything an LLM generates is a hallucination.
That doesn’t mean that everything LLMs generate is incorrect, far from it. What I am referencing is what hallucinations are actually defined as: A hallucination is a perception you have that is not connected to any actual stimulus. You hallucinate when you perceive something in the world that you have no sensor data for.
The term hallucination itself is an anthropomorphization of those statistical systems. They don’t “know”, or “think” or “lie” or do any such things. They iteratively calculate the most probable set of words and characters based on the original data. But if we look at how it is applied to “AI”s I think there is a big misunderstanding because it creates a difference between true and false statements that just isn’t there.
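To make the "most probable words" point concrete, here is a toy sketch. It is not how a real LLM is built: it assumes a tiny hand-written corpus and a simple bigram counter with greedy decoding, and the names `train_bigrams` and `generate` are hypothetical. It only illustrates that generation follows frequency in the data, with no check against reality anywhere in the loop:

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_tokens=6):
    """Greedy decoding: always append the most frequent follower."""
    out = [start]
    for _ in range(max_tokens - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no continuation seen in the data
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


# Made-up training data: the false statement simply appears more often.
corpus = [
    "the moon is made of rock",
    "the moon is made of cheese",
    "the moon is made of cheese",
]
counts = train_bigrams(corpus)
print(generate(counts, "the"))  # the moon is made of cheese
```

The true and the false sentence go through exactly the same code path; which one comes out depends only on the counts.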
For humans we separate “real perceptions” from hallucinations by the link to sensor data/stimuli: If there is an actual stimulus behind you feeling a touch, it’s real; if you just think you are being touched, it’s a hallucination. But for LLMs that distinction is meaningless.
A line of text that is true has – for the LLM – absolutely no different quality than one that is false. There is no link to reality, no sensor data or anchoring, there’s just the data it was trained on (that also doesn’t necessarily have any connection to reality). If using the term hallucination is useful to describe LLM output, it is to illustrate the quality of all output. Everything an LLM generates is a hallucination, some of it might accidentally be true.
And in that understanding the terminology might actually be enlightening, might actually help people understand what those systems are doing and where they might be appropriate to use and – more importantly – where not.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Burn your village to the ground!
Denmark’s Sissal will bring dance-banger “Hallucination” to the London Eurovision Party 2025 https://www.byteseu.com/801903/ #Denmark #Hallucination #LEP #LEP2025 #LondonEurovisionParty #LondonEurovisionParty2025 #Sissal
#ITByte: #AI #Hallucination is a phenomenon wherein a large language model (LLM) - often a generative AI chatbot or computer vision tool - perceives patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate.
https://knowledgezone.co.in/posts/AI-Hallucination-67bdb515f517bc50960ffc57
#AI #Hallucination or #Overthinking? How many vowels are in Alabama?
I asked the Alabama question to different AI Apps and AI models.
https://www.lotharschulz.info/2025/02/20/hallucination-or-overthinking/
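As a baseline for the Alabama question, the count is trivial to compute directly. A minimal sketch, assuming "vowels" means the letters a, e, i, o, u regardless of case (the helper name `count_vowels` is my own, not from the linked post):

```python
def count_vowels(word: str, vowels: str = "aeiou") -> int:
    """Count characters of `word` that are vowels, ignoring case."""
    return sum(1 for ch in word.lower() if ch in vowels)


print(count_vowels("Alabama"))  # 4
```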
Lawyers Caught Citing #AI-Hallucinated Cases Say It’s a ‘Cautionary Tale’ For All Law Firms
"The attorneys filed court documents referencing eight non-existent cases, then admitted it was a "#hallucination" by an AI tool."
https://www.courtwatch.news/p/lawyers-caught-citing-ai-hallucinated-cases-say-it-s-a-cautionary-tale-for-all-law-firms
How do our brains know what’s real?
From seeing things to hearing voices, there’s a finer line between #hallucination and #reality than you might suppose
Dual-minded #LLMs outperformed prompting techniques designed to mitigate #hallucination.
Solution time decreased faster than performance as models were given more leeway to switch from Monte Carlo Tree Search (MCTS) to standard inference.