#hallucination

@timnitGebru @emilymbender
Yup, today I did a search for something, and an #LLM summary / result combined an article I wrote with another, completely unrelated article, and provided a bullshit response that appeared factual.

Admittedly, it did give my site credit alongside the other article, but NOW, the problem is that it’s smearing my reputation as my article is correct, but the summary is wrong — implying that I’m the one that fucked up.

AI Got It Wrong - Prime Numbers

See how 6 AI engines respond to the question "Is 3821 a prime number?" followed by "What is the next prime number greater than 3821?" Compare speed, accuracy and each one's computational reasoning.

alanbonnici.com/2025/03/ai-got

#TTMO #AI #ChatGPT
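
Both questions in the linked post can be settled deterministically, which gives a baseline to compare the chatbot answers against. A minimal Python sketch, assuming plain trial division is good enough for numbers of this size; the names `is_prime` and `next_prime` are illustrative and not taken from the linked article:

```python
def is_prime(n: int) -> bool:
    """Trial division: try odd divisors up to the square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even prime
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime(n: int) -> int:
    """Return the smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


print(is_prime(3821), next_prime(3821))
```

Run as written, this reports 3821 as prime and 3823 as the next prime.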

It’s all hallucinations

The discourse on “AI” systems, chat bots, “assistants” and “research helpers” is defined by a lot of future promises. Those systems are dysfunctional, or at least not working great right now, but there’s the promise of things getting better in the future.

Which is how we often perceive tech to work: Early versions might be a bit wonky, but there’s constant iteration and work going on to improve systems to be more capable, more robust and maybe even cheaper at some point.

The most pressing problem for many modern “AI” systems, especially the generative systems that are all the rage these days, is so-called “hallucinations”, a term describing an AI system generating incorrect information. Think of a research agent inventing a paper to quote from that doesn’t exist. (Google’s AI assistant telling you to put glue on pizza is not a hallucination in that sense, because it is just regurgitating information from Reddit that every toddler would recognize as a joke.) Hallucinations are the big issue that many researchers are trying to address, with mixed results. Methods like RAG shift the probabilities a bit but do not solve the problem: hallucinations keep happening.

But I think that this discourse misses an important thing: Anything an LLM generates is a hallucination.

That doesn’t mean that everything LLMs generate is incorrect, far from it. What I am referencing is what hallucinations are actually defined as: A hallucination is a perception you have that is not connected to any actual stimulus. You hallucinate when you perceive something in the world that you have no sensor data for.

The term hallucination itself is an anthropomorphization of those statistical systems. They don’t “know” or “think” or “lie” or do any such things. They iteratively calculate the most probable set of words and characters based on the original data. But if we look at how the term is applied to “AI” systems, I think there is a big misunderstanding, because it creates a difference between true and false statements that just isn’t there.

For humans we separate “real perceptions” from hallucinations by the link to sensor data or stimuli: if there is an actual stimulus behind you feeling a touch, it’s real; if you just think you are being touched, it’s a hallucination. But for LLMs that distinction is meaningless.

A line of text that is true has, for the LLM, absolutely no different quality than one that is false. There is no link to reality, no sensor data or anchoring; there’s just the data it was trained on (which also doesn’t necessarily have any connection to reality). If the term hallucination is useful for describing LLM output, it is as an illustration of the quality of all output. Everything an LLM generates is a hallucination; some of it might accidentally be true.
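
To make that concrete, here is a toy sketch in Python. It is not a real LLM, only a bigram model that counts which word follows which and samples from those counts. The corpus, the names `train_bigrams` and `generate`, and the fixed seed are all assumptions invented for illustration. The only point is that one procedure produces both the accurate and the inaccurate sentence, with no step that consults reality:

```python
import random
from collections import Counter, defaultdict

# Hypothetical training text: one accurate and one inaccurate sentence.
CORPUS = [
    "3821 is a prime number",
    "3821 is a composite number",
]


def train_bigrams(sentences):
    """Count how often each word follows another; nothing else is stored."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def generate(follows, rng, max_words=20):
    """Sample one word at a time, weighted by the observed counts."""
    word, out = "<s>", []
    for _ in range(max_words):
        candidates = follows[word]
        word = rng.choices(list(candidates), weights=list(candidates.values()))[0]
        if word == "</s>":  # end-of-sentence marker
            break
        out.append(word)
    return " ".join(out)


model = train_bigrams(CORPUS)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = {generate(model, rng) for _ in range(50)}
print(samples)
```

In this sketch the words “prime” and “composite” are equally likely after “a”, so both sentences come out of the same sampling loop. Which one happens to be true is invisible to the model.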

And in that understanding the terminology might actually be enlightening: it might help people understand what those systems are doing and where it might be appropriate to use them and, more importantly, where not.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.