Title: Investigating Prompting Techniques for Zero- and Few-Shot Visual Question Answering

Jun 16, 2023

Rabiul Awal, Le Zhang, Aishwarya Agrawal

Abstract: Visual question answering (VQA) is a challenging task that requires the ability to comprehend and reason with visual information. While recent vision-language models have made strides, they continue to struggle with zero-shot VQA, particularly in handling complex compositional questions and in adapting to new domains, i.e., knowledge-based reasoning. This paper explores the use of various prompting strategies, focusing on the BLIP2 model, to enhance zero-shot VQA performance. We conduct a comprehensive investigation across several VQA datasets, examining the effectiveness of different question templates, the role of few-shot exemplars, the impact of chain-of-thought (CoT) reasoning, and the benefits of incorporating image captions as additional visual cues. Although outcomes vary, our findings demonstrate that carefully designed question templates and the integration of additional visual cues, such as image captions, can improve VQA performance, especially when used in conjunction with few-shot examples. However, we also identify a limitation of chain-of-thought rationalization: it negatively affects VQA accuracy. Our study thus provides critical insights into the potential of prompting for improving zero-shot VQA performance.
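
To make the ingredients named in the abstract concrete, here is a minimal sketch of how a question template, an optional image caption, and few-shot exemplars might be combined into one text prompt. The template wording, the `Exemplar` class, and the `build_prompt` helper are illustrative assumptions, not the templates or code used in the paper, and the call to the vision-language model is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Exemplar:
    """One few-shot example (hypothetical structure, not from the paper)."""
    question: str
    answer: str
    caption: Optional[str] = None


# Assumed template wording, chosen only for illustration.
QA_TEMPLATE = "Question: {question} Short answer:"


def format_example(question: str,
                   caption: Optional[str] = None,
                   answer: Optional[str] = None) -> str:
    """Render one question, optionally preceded by a caption and followed by an answer."""
    parts = []
    if caption:
        # The caption acts as an additional textual cue about the image.
        parts.append(f"Context: {caption}")
    parts.append(QA_TEMPLATE.format(question=question))
    text = " ".join(parts)
    if answer is not None:
        text += f" {answer}"
    return text


def build_prompt(question: str,
                 caption: Optional[str] = None,
                 exemplars: Sequence[Exemplar] = ()) -> str:
    """Place solved exemplars first, then the unanswered target question."""
    blocks = [format_example(e.question, e.caption, e.answer) for e in exemplars]
    blocks.append(format_example(question, caption))
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = build_prompt(
        "What color is the bus?",
        caption="A red bus on a street.",
        exemplars=[Exemplar("How many dogs are there?", "two", "Two dogs on grass.")],
    )
    print(demo)
```

With no caption and no exemplars, `build_prompt` reduces to the bare question template, which corresponds to the plain zero-shot setting; adding exemplars or a caption only prepends text, so the settings can be compared with the same model call.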
