Title: Hallucination, Monofacts, and Miscalibration: An Empirical Investigation

Feb 11, 2025

Muqing Miao, Michael Kearns

Figures 1-4 for Hallucination, Monofacts, and Miscalibration: An Empirical Investigation


Abstract: Recent theoretical work by [Kalai and Vempala 2024] proves that a particular notion of hallucination rate in LLMs must be lower-bounded by the training data monofact rate (related to the classical Good-Turing missing mass estimator) minus model miscalibration. Through systematic experiments with n-gram models and in-context learning with LLMs, we empirically investigate and validate this theory by examining how different underlying data distributions affect the monofact rate and a model's tendency to hallucinate. We then vary model miscalibration through controlled upweighting of training samples while holding monofact rates constant, allowing us to isolate miscalibration's reduction effect on hallucination. These findings suggest that both the distribution of fact frequencies in training data and the calibration-hallucination trade-off are inherent to probabilistic language generation. Our results also suggest that current practices of aggressive deduplication in training data may need to be reconsidered, as selective duplication could serve as a principled mechanism for reducing hallucination.

* Code available at https://github.com/mmiao2/Hallucination.git
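
To make the quantities in the abstract concrete, here is a minimal Python sketch. It is not taken from the authors' repository. It assumes the monofact rate is the fraction of training samples whose fact occurs exactly once (the Good-Turing style estimate the abstract alludes to), and it treats the bound as simply "monofact rate minus miscalibration", leaving out any additional correction terms the formal theorem may include. The function names `monofact_rate` and `hallucination_lower_bound` and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def monofact_rate(samples):
    """Fraction of training samples whose fact appears exactly once.

    Assumption: each element of `samples` is one hashable "fact";
    this mirrors the Good-Turing missing-mass estimate N1 / N.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    counts = Counter(samples)
    # Count the facts that were observed exactly one time.
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(samples)


def hallucination_lower_bound(mono_rate, miscalibration):
    """Simplified bound: monofact rate minus miscalibration, floored at 0.

    Assumption: extra terms of the formal theorem are ignored here.
    """
    return max(0.0, mono_rate - miscalibration)


# Toy training set: "b", "d", and "e" each appear once (3 monofacts of 8).
facts = ["a", "a", "b", "c", "c", "c", "d", "e"]
rate = monofact_rate(facts)
bound = hallucination_lower_bound(rate, miscalibration=0.1)
print(f"monofact rate = {rate:.3f}, simplified lower bound = {bound:.3f}")
```

In this simplified form, a larger miscalibration value lowers the bound, which is the trade-off the abstract describes.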
