Title: Class-Aware Visual Prompt Tuning for Vision-Language Pre-Trained Model

Aug 22, 2022

Yinghui Xing, Qirui Wu, De Cheng, Shizhou Zhang, Guoqiang Liang, Yanning Zhang

Abstract: With the emergence of large pre-trained vision-language models like CLIP, transferable representations can be adapted to a wide range of downstream tasks via prompt tuning. Prompt tuning tries to probe the information beneficial to downstream tasks from the general knowledge stored in both the image and text encoders of the pre-trained vision-language model. A recently proposed method named Context Optimization (CoOp) introduces a set of learnable vectors as a text prompt on the language side. However, tuning the text prompt alone cannot affect the visual features computed by the image encoder, which leads to sub-optimal performance. In this paper, we propose a dual-modality prompt tuning paradigm that learns text prompts and visual prompts for the text and image encoders simultaneously. In addition, to make the visual prompt concentrate more on the target visual concept, we propose Class-Aware Visual Prompt Tuning (CAVPT), in which the visual prompt is generated dynamically by performing cross attention between language descriptions of template prompts and visual class token embeddings. Our method provides a new paradigm for tuning large pre-trained vision-language models, and extensive experimental results on 8 datasets demonstrate the effectiveness of the proposed method. Our code is available in the supplementary materials.
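The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-head scaled dot-product attention without learned projections, and it assumes the text features of the template prompts act as queries while the visual token embeddings act as keys and values (the abstract does not state which side plays which role). The names `cross_attention`, `text_features`, and `visual_tokens`, as well as the toy values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross attention (no learned projections).

    Each query attends over all keys; the result is a weighted average of
    the values, so one output vector is produced per query.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity between this query and every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output coordinate at a time.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# Hypothetical toy inputs: 2 class descriptions and 3 visual tokens, dimension 4.
text_features = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
visual_tokens = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]]

# Assumption: text features are the queries, visual tokens are keys and values.
class_aware_prompts = cross_attention(text_features, visual_tokens, visual_tokens)
```

In this sketch each class description yields one prompt vector that is a convex combination of the visual tokens, which is one way a prompt could be made to depend on both the class text and the current image. A real model would add learned query, key, and value projections and likely multiple heads.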

* 9 pages, 4 figures
