Title: Pre-Training Multimodal Hallucination Detectors with Corrupted Grounding Data

Aug 30, 2024

Spencer Whitehead, Jacob Phillips, Sean Hendryx

Abstract: Multimodal language models can exhibit hallucinations in their outputs, which limits their reliability. The ability to automatically detect these errors is important for mitigating them, but it has been less explored, and existing efforts do not localize hallucinations, instead framing detection as a classification task. In this work, we first pose multimodal hallucination detection as a sequence labeling task where models must localize hallucinated text spans, and we present a strong baseline model. Given the high cost of human annotations for this task, we propose an approach to improve the sample efficiency of these models by creating corrupted grounding data, which we use for pre-training. Leveraging phrase grounding data, we generate hallucinations to replace grounded spans and create hallucinated text. Experiments show that pre-training on this data improves sample efficiency when fine-tuning, and that the learning signal from the grounding data plays an important role in these improvements.
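
As a rough illustration of the corruption idea described in the abstract, the following minimal Python sketch replaces grounded phrases in a caption and emits token-level labels suitable for sequence labeling. It is not the authors' implementation: the function name `corrupt_caption`, the `propose` callback (a stand-in for whatever generates the hallucinated phrase), the `rate` parameter, whitespace tokenization, and the binary label scheme are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional, Tuple

Span = Tuple[int, int]  # [start, end) character offsets of a grounded phrase


def corrupt_caption(
    text: str,
    spans: List[Span],
    propose: Callable[[str], str],
    rate: float = 0.5,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], List[int]]:
    """Replace some grounded spans and label the replaced tokens.

    Hypothetical helper: `propose` maps a grounded phrase to a replacement
    phrase. Spans are assumed to be non-overlapping. Returns whitespace
    tokens and labels, where 1 marks a token inside a replaced span.
    """
    rng = rng or random.Random(0)
    tokens: List[str] = []
    labels: List[int] = []

    def emit(fragment: str, label: int) -> None:
        for tok in fragment.split():
            tokens.append(tok)
            labels.append(label)

    cursor = 0
    for start, end in sorted(spans):
        emit(text[cursor:start], 0)  # text between grounded spans is kept
        phrase = text[start:end]
        if rng.random() < rate:
            emit(propose(phrase), 1)  # corrupted span, labeled as hallucinated
        else:
            emit(phrase, 0)  # grounded span left intact
        cursor = end
    emit(text[cursor:], 0)  # remaining text after the last span
    return tokens, labels


# Toy example with hand-written replacements (assumed, not from the paper).
caption = "a dog chases a red ball"
grounded = [(0, 5), (13, 23)]  # "a dog", "a red ball"
swap = lambda p: "a cat" if "dog" in p else "a blue frisbee"
corrupted_tokens, corrupted_labels = corrupt_caption(caption, grounded, swap, rate=1.0)
clean_tokens, clean_labels = corrupt_caption(caption, grounded, swap, rate=0.0)
```

With `rate=1.0` every grounded span is replaced, and with `rate=0.0` the caption is returned unchanged with all-zero labels. In a real pipeline the replacement would have to be implausible given the image, which this toy callback does not check.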
