Title: In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

Mar 12, 2024

Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He



Abstract: Large language models (LLMs) frequently hallucinate and produce factual errors, yet our understanding of why they make these errors remains limited. In this study, we examine the underlying mechanisms of LLM hallucinations from the perspective of inner representations, and discover a salient pattern associated with hallucinations: correct generations tend to have sharper context activations in the hidden states of the in-context tokens than incorrect ones. Leveraging this insight, we propose an entropy-based metric to quantify the "sharpness" among the in-context hidden states and incorporate it into the decoding process to formulate a constrained decoding approach. Experiments on various knowledge-seeking and hallucination benchmarks demonstrate our approach's consistent effectiveness; for example, it achieves up to an 8.6-point improvement on TruthfulQA. We believe this study can improve our understanding of hallucinations and serve as a practical solution for hallucination mitigation.
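The abstract does not state the metric's formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each candidate next token has a non-negative activation at every in-context token (for instance, a probability read off that token's hidden state), that "sharpness" is measured as low entropy of those activations once normalized across the context, and that the entropy is subtracted from the next-token log-probability with a weight `alpha`. The function names, the `alpha` parameter, and the penalty form are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def context_entropy(activations):
    """Entropy of each candidate token's activations across the context.

    activations[i][v] is the (assumed, non-negative) activation of candidate
    token v at in-context token i. Low entropy means the activation is
    concentrated on a few in-context tokens, i.e. it is "sharp".
    """
    n_ctx = len(activations)
    n_vocab = len(activations[0])
    entropies = []
    for v in range(n_vocab):
        column = [activations[i][v] for i in range(n_ctx)]
        total = sum(column)
        if total == 0.0:
            # No activation anywhere: treat as maximally diffuse.
            entropies.append(math.log(n_ctx))
            continue
        entropies.append(entropy([c / total for c in column]))
    return entropies


def adjust_scores(next_token_logits, activations, alpha=1.0):
    """Penalize candidates whose in-context activations are diffuse.

    Hypothetical scoring rule: log p(v) - alpha * entropy(v).
    """
    log_probs = [math.log(p) for p in softmax(next_token_logits)]
    entropies = context_entropy(activations)
    return [lp - alpha * e for lp, e in zip(log_probs, entropies)]


if __name__ == "__main__":
    # Three in-context tokens, two candidate tokens.
    # Candidate 0 is activated mostly by the first in-context token (sharp);
    # candidate 1 is activated equally by all of them (diffuse).
    activations = [
        [0.90, 0.30],
        [0.05, 0.30],
        [0.05, 0.30],
    ]
    logits = [0.0, 0.0]  # the base model is indifferent between the two
    scores = adjust_scores(logits, activations, alpha=1.0)
    best = max(range(len(scores)), key=scores.__getitem__)
    print("adjusted scores:", scores)
    print("preferred candidate:", best)
```

In this toy setup the base model assigns equal probability to both candidates, so the entropy term alone decides the ranking and the candidate with the more concentrated activations is preferred. Setting `alpha=0.0` recovers the unadjusted log-probabilities.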

* code repo is available at: https://github.com/hkust-nlp/Activation_decoding.git
