Friendly chatbots sacrifice accuracy for empathy, study finds
Training chatbots to adopt a warmer, more empathetic communication style may systematically reduce the accuracy of their responses and increase their tendency to agree with users, even when those users are wrong. That is the conclusion of researchers at the University of Oxford, whose study was published in Nature, Qazinform News Agency reports.
The study highlights that developers are increasingly optimizing language models not only for usefulness and safety, but also for “character,” including friendliness, empathy, and the ability to build an emotional connection with users. However, such “personalization” comes at a measurable cost to accuracy.
Experiments on five models with different architectures showed that after additional training for a “warm” style, error rates increased by 10 to 30 percentage points. The models were more likely to provide incorrect factual answers, make mistakes in medical advice, and support conspiracy claims.
Accuracy declined further in emotionally charged interactions. When users expressed emotions, particularly sadness, the accuracy gap between the original and “warm” models reached nearly 12 percentage points.
The study also found an increase in so-called “sycophancy,” the tendency to agree with users regardless of whether their statements are true. On average, the “warm” models were about 40% more likely to validate incorrect beliefs.
At the same time, the drop in accuracy is not linked to an overall degradation of model capabilities. On standard knowledge and reasoning benchmarks, the warm models perform comparably to their original versions. This suggests the issue is selective: the models sacrifice factual correctness specifically to maintain a “comfortable” interaction.
“Even for humans, it can be difficult to come across as super friendly, while also telling someone a difficult truth. When we train AI chatbots to prioritize warmth, they might make mistakes they otherwise wouldn’t,” said lead author Lujain Ibrahim.
Control experiments confirmed that training for “warmth” specifically is the key factor behind the decline. Models trained in a more neutral or “cold” style showed no comparable drop in accuracy and, in some cases, even improved performance.
Earlier, Qazinform News Agency reported that AI chatbots that mimic empathy may pose risks to users’ mental health, particularly among emotionally vulnerable individuals.