Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models

Sep 22, 2023

Cheng Shi, Sibei Yang

Figure 1 for LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models

Figure 2 for LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models

Figure 3 for LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models

Figure 4 for LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models

Share this with someone who'll enjoy it:

Abstract:Prompt engineering is a powerful tool used to enhance the performance of pre-trained models on downstream tasks. For example, providing the prompt "Let's think step by step" improved GPT-3's reasoning accuracy to 63% on MutiArith while prompting "a photo of" filled with a class name enables CLIP to achieve $80$\% zero-shot accuracy on ImageNet. While previous research has explored prompt learning for the visual modality, analyzing what constitutes a good visual prompt specifically for image recognition is limited. In addition, existing visual prompt tuning methods' generalization ability is worse than text-only prompting tuning. This paper explores our key insight: synthetic text images are good visual prompts for vision-language models! To achieve that, we propose our LoGoPrompt, which reformulates the classification objective to the visual prompt selection and addresses the chicken-and-egg challenge of first adding synthetic text images as class-wise visual prompts or predicting the class first. Without any trainable visual prompt parameters, experimental results on 16 datasets demonstrate that our method consistently outperforms state-of-the-art methods in few-shot learning, base-to-new generalization, and domain generalization.

* ICCV 2023; Project Page:https://chengshiest.github.io/logo

View paper on

Share this with someone who'll enjoy it:

Title:LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models

Paper and Code