Title: Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios

Nov 05, 2024

Yunkai Dang, Mengxi Gao, Yibo Yan, Xin Zou, Yanggan Gu, Aiwei Liu, Xuming Hu

Abstract: Ensuring that Multimodal Large Language Models (MLLMs) maintain consistency in their responses is essential for developing trustworthy multimodal intelligence. However, existing benchmarks include many samples where all MLLMs exhibit high response uncertainty when encountering misleading information, so that as many as 5-15 response attempts per sample are needed to assess uncertainty effectively. Therefore, we propose a two-stage pipeline: first, we collect MLLMs' responses without misleading information, and then we collect their responses under specific misleading instructions. By calculating the misleading rate, and capturing both correct-to-incorrect and incorrect-to-correct shifts between the two sets of responses, we can effectively measure the model's response uncertainty. Finally, we establish a Multimodal Uncertainty Benchmark (MUB) that employs both explicit and implicit misleading instructions to comprehensively assess the vulnerability of MLLMs across diverse domains. Our experiments reveal that all open-source and closed-source MLLMs are highly susceptible to misleading instructions, with an average misleading rate exceeding 86%. To enhance the robustness of MLLMs, we further fine-tune all open-source MLLMs by incorporating explicit and implicit misleading data, which yields a significant reduction in misleading rates. Our code is available at: https://github.com/Yunkai696/MUB
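
The two-stage comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula for the misleading rate, so the normalization below is an assumption: correct-to-incorrect shifts are divided by the number of initially correct answers, and incorrect-to-correct shifts by the number of initially incorrect ones. The function name `shift_rates` and the boolean input format are hypothetical and are not taken from the MUB repository.

```python
from typing import Optional, Sequence


def shift_rates(before: Sequence[bool], after: Sequence[bool]) -> dict[str, Optional[float]]:
    """Compare per-sample correctness before and after a misleading instruction.

    before[i]: was the stage-1 answer (no misleading information) correct?
    after[i]:  was the stage-2 answer (with a misleading instruction) correct?
    The normalization used here is an assumption, not the paper's stated formula.
    """
    if len(before) != len(after):
        raise ValueError("before and after must cover the same samples")

    n_correct = sum(before)
    n_incorrect = len(before) - n_correct

    # Count answers that flipped in each direction between the two stages.
    correct_to_incorrect = sum(b and not a for b, a in zip(before, after))
    incorrect_to_correct = sum((not b) and a for b, a in zip(before, after))

    return {
        # None when the denominator is empty, so the rate is undefined.
        "correct_to_incorrect": correct_to_incorrect / n_correct if n_correct else None,
        "incorrect_to_correct": incorrect_to_correct / n_incorrect if n_incorrect else None,
    }


# Toy data: five samples, three answered correctly in stage 1.
stage1 = [True, True, True, False, False]
stage2 = [False, False, True, True, False]
rates = shift_rates(stage1, stage2)
print(rates)
```

In this toy run, two of the three initially correct answers flip to incorrect and one of the two initially incorrect answers flips to correct. Reporting the two directions separately keeps a model that is easily talked out of right answers distinguishable from one that is easily talked into them.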
