Title: SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

Mar 15, 2023

Potsawee Manakul, Adian Liusie, Mark J. F. Gales

Abstract: Generative Large Language Models (LLMs) such as GPT-3 are capable of generating highly fluent responses to a wide variety of user prompts. However, LLMs are known to hallucinate facts and make non-factual statements, which can undermine trust in their output. Existing fact-checking approaches require either access to the token-level output probability distribution (which may not be available for systems such as ChatGPT) or external databases that are interfaced via separate, often complex, modules. In this work, we propose "SelfCheckGPT", a simple sampling-based approach that can be used to fact-check black-box models in a zero-resource fashion, i.e. without an external database. SelfCheckGPT leverages the simple idea that if an LLM has knowledge of a given concept, sampled responses are likely to be similar and contain consistent facts. For hallucinated facts, however, stochastically sampled responses are likely to diverge and contradict one another. We investigate this approach by using GPT-3 to generate passages about individuals from the WikiBio dataset, and manually annotate the factuality of the generated passages. We demonstrate that SelfCheckGPT can: i) detect non-factual and factual sentences; and ii) rank passages in terms of factuality. We compare our approach to several existing baselines and show that in sentence hallucination detection, our approach has AUC-PR scores comparable to grey-box methods, while SelfCheckGPT is best at passage factuality assessment.
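The sampling-based idea in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not say how consistency is measured, so the word-overlap measure below is an assumption chosen only to keep the example self-contained, and the function names (`support`, `inconsistency_score`, `passage_score`) are hypothetical. The sampled responses are assumed to have been drawn from the same black-box model beforehand and are passed in as plain strings.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence, sample):
    """Fraction of the sentence's tokens that also occur in one sampled response.

    Assumption: word overlap is a crude stand-in for "the sample agrees
    with the sentence"; the abstract does not specify the measure.
    """
    sentence_tokens = tokenize(sentence)
    if not sentence_tokens:
        return 1.0  # nothing to check, so treat as fully supported
    return len(sentence_tokens & tokenize(sample)) / len(sentence_tokens)


def inconsistency_score(sentence, samples):
    """Average lack of support across samples; higher suggests hallucination."""
    if not samples:
        raise ValueError("need at least one sampled response")
    return sum(1.0 - support(sentence, s) for s in samples) / len(samples)


def passage_score(sentences, samples):
    """Passage-level score: mean of the sentence-level scores."""
    if not sentences:
        raise ValueError("need at least one sentence")
    return sum(inconsistency_score(s, samples) for s in sentences) / len(sentences)


# Hypothetical data: two sentences from a main response, and two
# stochastically sampled responses to the same prompt.
samples = [
    "Alice was born in 1950 in Paris.",
    "Born in 1950, Alice was a chemist.",
]
consistent = "Alice was born in 1950."
divergent = "Alice won the Nobel Prize."

print(inconsistency_score(consistent, samples))
print(inconsistency_score(divergent, samples))
print(passage_score([consistent, divergent], samples))
```

A sentence whose content recurs across the samples scores low, while one that the samples never repeat scores high; sorting passages by `passage_score` gives a ranking in the spirit of item ii) above. Word overlap cannot detect outright contradictions, so a real system would need a stronger agreement measure.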
