
Title: Noise-Tolerant Unsupervised Adapter for Vision-Language Models

Sep 26, 2023

Eman Ali, Dayan Guan, Shijian Lu, Abdulmotaleb Elsaddik



Abstract: Recent large-scale vision-language models have achieved impressive performance on various zero-shot image classification tasks. While prior studies have demonstrated significant improvements by introducing few-shot labelled target samples, they still require labelling of target samples, which greatly limits their scalability across visual recognition tasks. We design NtUA, a Noise-tolerant Unsupervised Adapter that allows learning superior target models with few-shot unlabelled target samples. NtUA works as a key-value cache that formulates the visual features and predicted pseudo-labels of the few-shot unlabelled target samples as key-value pairs. It consists of two complementary designs. The first is adaptive cache formation, which combats pseudo-label noise by weighting the key-value pairs according to their prediction confidence. The second is pseudo-label rectification, which corrects both the pair values (i.e., pseudo-labels) and the cache weights by leveraging knowledge distillation from large-scale vision-language models. Extensive experiments show that NtUA consistently achieves superior performance across multiple widely adopted benchmarks.
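The confidence-weighted key-value cache described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names (`build_weighted_cache`, `cache_logits`), the exponential affinity borrowed from Tip-Adapter-style caches, and the sharpness parameter `beta` are all assumptions made for illustration. Pseudo-label rectification via knowledge distillation is not covered here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def build_weighted_cache(features, zero_shot_logits):
    """Hypothetical cache builder (names and weighting scheme are assumptions).

    features:         (N, D) visual features of the unlabelled samples
    zero_shot_logits: (N, C) zero-shot class logits for the same samples
    Returns keys (N, D), confidence-scaled one-hot values (N, C), weights (N,).
    """
    probs = softmax(zero_shot_logits)
    pseudo_labels = probs.argmax(axis=1)      # predicted pseudo-labels
    confidence = probs.max(axis=1)            # prediction confidence per sample
    keys = l2_normalize(features)
    one_hot = np.eye(probs.shape[1])[pseudo_labels]
    values = one_hot * confidence[:, None]    # down-weight uncertain pairs
    return keys, values, confidence

def cache_logits(query, keys, values, beta=5.0):
    # Assumed Tip-Adapter-style affinity: exp(-beta * (1 - cosine similarity)).
    q = l2_normalize(query)
    affinity = np.exp(-beta * (1.0 - q @ keys.T))
    return affinity @ values

# Toy data: two unlabelled samples, three feature dimensions, two classes.
features = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
zero_shot_logits = np.array([[4.0, 0.0], [0.0, 1.0]])
keys, values, weights = build_weighted_cache(features, zero_shot_logits)
query_logits = cache_logits(np.array([[1.0, 0.0, 0.0]]), keys, values)
```

In this sketch, a confidently pseudo-labelled sample contributes almost its full one-hot vote to nearby queries, while a low-confidence sample contributes proportionally less, which is one simple way to limit the influence of noisy pseudo-labels.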
