OpenAI study reveals why AI models “hallucinate”

OpenAI has released a new research paper examining why large language models sometimes produce confident but incorrect answers — a phenomenon known as “hallucinations,” Kazinform News Agency correspondent reports.


Why hallucinations occur

According to the paper, the main reason lies in how models are currently trained and evaluated. Standard benchmarks reward accuracy but provide no incentive for admitting uncertainty. As a result, models are more likely to “guess” when unsure, since a lucky guess may score points, while saying “I don’t know” always receives zero.
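
The arithmetic behind that incentive is straightforward. As a minimal sketch (an illustration for this article, not code from the paper): under plain accuracy grading, any guess with a nonzero chance of being right has a higher expected score than saying "I don't know."

```python
# Minimal sketch of the incentive described above (illustrative, not from the paper):
# under binary accuracy grading, guessing always beats abstaining in expectation.

def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected score on one question when only exact answers earn a point."""
    if abstain:
        return 0.0       # "I don't know" is simply marked wrong
    return p_correct     # a guess earns 1 with probability p_correct, else 0

# Even a long-shot guess (10% chance of being right) beats abstaining.
print(expected_score(0.10, abstain=False))  # 0.1
print(expected_score(0.10, abstain=True))   # 0.0
```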

This system encourages confident errors, which appear convincing but are factually wrong. OpenAI argues that this dynamic is a key driver behind hallucinations in AI systems, even as models become more advanced.

Researchers tested popular chatbots on seemingly simple factual questions, such as the title of a particular researcher's PhD dissertation or that researcher's date of birth. The models produced several different answers, all incorrect, yet delivered each of them with confidence.

A comparison of two OpenAI models highlights the trade-off: the older o4-mini achieved slightly higher accuracy (24%) but produced wrong answers 75% of the time. By contrast, the newer GPT-5-thinking-mini showed slightly lower accuracy (22%) but far fewer errors (26%), because it abstained from guessing in over half of the cases.

Origins of errors

The paper also explains why factual mistakes are harder to eliminate than other errors, such as spelling mistakes. During pretraining, language models learn by predicting the next word across vast amounts of text. Grammatical patterns recur often enough to be learned reliably, but low-frequency facts, such as a person's date of birth, follow no pattern that prediction can pick up, which makes them inherently more error-prone.
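
To make the mechanism concrete, here is a toy sketch (a deliberately simplified illustration, not OpenAI's training setup): a bigram counter that predicts each next word from frequency counts. A grammatical pattern that recurs is predicted reliably, while a fact mentioned only once is swamped by more frequent continuations.

```python
# Toy next-word predictor (illustrative assumption, not OpenAI's training code).
from collections import Counter, defaultdict

corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the bird sat on the fence . "
    "alice was born on 3 march . "   # this fact appears exactly once
).split()

# Count, for each word, which words follow it and how often.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict(word: str) -> str:
    """Return the most frequent continuation seen in the corpus."""
    return next_counts[word].most_common(1)[0][0]

print(predict("sat"))     # 'on'  -- a pattern seen three times is learned reliably
print(next_counts["on"])  # Counter({'the': 3, '3': 1}) -- the one-off birthday loses to 'the'
```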

The study suggests revising evaluation standards so that confident errors are penalized more heavily than admissions of uncertainty. Partial credit could be given when models indicate they do not know an answer. This approach, researchers argue, would encourage safer and more reliable behavior, similar to some standardized exams where wrong answers lower scores but skipped questions do not.
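
As a rough sketch of what such a rule could look like (the penalty value below is an assumption for illustration, not OpenAI's proposal, and the abstention rates are derived from the accuracy and error figures quoted above), exam-style grading already reverses the ranking of the two models:

```python
# Exam-style grading: +1 for a correct answer, a penalty for a wrong one,
# 0 for "I don't know". The penalty of 1.0 is an illustrative assumption.

def exam_score(correct_rate: float, error_rate: float, abstain_rate: float,
               wrong_penalty: float = 1.0) -> float:
    """Average score per question under penalized grading."""
    return correct_rate * 1.0 - error_rate * wrong_penalty + abstain_rate * 0.0

# Rates reported above (abstention = 100% minus accuracy and error rates):
print(exam_score(0.24, 0.75, 0.01))  # -0.51  o4-mini: frequent confident errors are punished
print(exam_score(0.22, 0.26, 0.52))  # -0.04  GPT-5-thinking-mini: abstaining scores better
```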

OpenAI concludes that hallucinations cannot be eliminated entirely, since some questions are inherently unanswerable. However, they can be significantly reduced by changing evaluation methods and encouraging models to abstain when uncertain.

Earlier, Kazinform News Agency reported on how reliable AI book summaries are.
