Title: Exploring Question Decomposition for Zero-Shot VQA

Oct 25, 2023

Zaid Khan, Vijay Kumar BG, Samuel Schulter, Manmohan Chandraker, Yun Fu

Abstract: Visual question answering (VQA) has traditionally been treated as a single-step task where each question receives the same amount of effort, unlike natural human question-answering strategies. We explore a question decomposition strategy for VQA to overcome this limitation. We probe the ability of recently developed large vision-language models to use human-written decompositions and produce their own decompositions of visual questions, finding they are capable of learning both tasks from demonstrations alone. However, we show that naive application of model-written decompositions can hurt performance. We introduce a model-driven selective decomposition approach for second-guessing predictions and correcting errors, and validate its effectiveness on eight VQA tasks across three domains, showing consistent improvements in accuracy, including improvements of >20% on medical VQA datasets and boosting the zero-shot performance of BLIP-2 above chance on a VQA reformulation of the challenging Winoground task. Project Site: https://zaidkhan.me/decomposition-0shot-vqa/

* NeurIPS 2023 Camera Ready
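
The abstract describes selective decomposition only at a high level: the model second-guesses a prediction and decomposes the question only when that seems worthwhile. The sketch below is one possible reading of that idea, not the authors' implementation. It assumes the trigger is a confidence threshold, which the abstract does not state. The names `Prediction`, `answer_fn`, `decompose_fn`, and `threshold` are hypothetical, and the image is assumed to be bound inside the two callables.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Prediction:
    answer: str
    confidence: float  # assumed to lie in [0, 1]


# Hypothetical interfaces: (question, context) -> Prediction, and question -> sub-questions.
AnswerFn = Callable[[str, str], Prediction]
DecomposeFn = Callable[[str], List[str]]


def selective_decomposition(
    question: str,
    answer_fn: AnswerFn,
    decompose_fn: DecomposeFn,
    threshold: float = 0.5,  # assumed trigger; not taken from the paper
) -> Prediction:
    """Answer directly, and decompose only when the first answer looks unreliable."""
    first = answer_fn(question, "")
    if first.confidence >= threshold:
        # Confident enough: keep the single-step answer.
        return first

    # Low confidence: answer each sub-question and collect the results as context.
    facts = []
    for sub_question in decompose_fn(question):
        sub = answer_fn(sub_question, "")
        facts.append(f"{sub_question} {sub.answer}")

    # Second-guess the original prediction with the sub-answers as context.
    return answer_fn(question, " ".join(facts))
```

Gating the decomposition behind a check reflects the abstract's observation that applying model-written decompositions naively to every question can hurt performance.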
