Title: Verbalized Representation Learning for Interpretable Few-Shot Generalization

Nov 27, 2024

Cheng-Fu Yang, Da Yin, Wenbo Hu, Nanyun Peng, Bolei Zhou, Kai-Wei Chang

Abstract: Humans recognize objects after observing only a few examples, a remarkable capability enabled by their inherent language understanding of the real-world environment. Developing verbalized and interpretable representations can significantly improve model generalization in low-data settings. In this work, we propose Verbalized Representation Learning (VRL), a novel approach for automatically extracting human-interpretable features for object recognition using few-shot data. Our method uniquely captures inter-class differences and intra-class commonalities in the form of natural language by employing a Vision-Language Model (VLM) to identify key discriminative features between different classes and shared characteristics within the same class. These verbalized features are then mapped to numeric vectors through the VLM. The resulting feature vectors can then be used to train downstream classifiers and run inference with them. Experimental results show that, at the same model scale, VRL achieves a 24% absolute improvement over prior state-of-the-art methods while using 95% less data and a smaller model. Furthermore, compared to human-labeled attributes, the features learned by VRL exhibit a 20% absolute gain when used for downstream classification tasks. Code is available at: https://github.com/joeyy5588/VRL/tree/main.
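
The pipeline described in the abstract (verbalized features, then numeric vectors, then a downstream classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature strings, the toy "images" (dictionaries of attribute strengths), and the `stub_vlm_score` function are all hypothetical stand-ins for real VLM calls, and the nearest-centroid classifier is just one simple choice of downstream classifier assumed here for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical verbalized features. In VRL these would be written by a VLM
# after comparing few-shot examples across and within classes.
VERBALIZED_FEATURES = [
    "has a red crown",
    "has webbed feet",
    "has a long curved beak",
]

# A toy "image" is a dict of attribute strengths (an assumption for this sketch).
Image = Dict[str, float]


def stub_vlm_score(image: Image, feature: str) -> float:
    """Stand-in for asking a VLM how strongly `feature` holds for `image`."""
    return float(image.get(feature, 0.0))


def featurize(image: Image, features: List[str],
              score_fn: Callable[[Image, str], float]) -> List[float]:
    """Map an image to one numeric score per verbalized feature."""
    return [score_fn(image, f) for f in features]


def fit_centroids(vectors: List[List[float]],
                  labels: List[str]) -> Dict[str, List[float]]:
    """Average the feature vectors of each class (nearest-centroid training)."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for vec, label in zip(vectors, labels):
        prev = sums.get(label, [0.0] * len(vec))
        sums[label] = [s + v for s, v in zip(prev, vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(vec: List[float], centroids: Dict[str, List[float]]) -> str:
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(a: List[float], b: List[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda lab: sqdist(vec, centroids[lab]))


# Few-shot toy training set: two examples per class.
train_images = [
    {"has a red crown": 0.9, "has a long curved beak": 0.1},
    {"has a red crown": 0.8, "has a long curved beak": 0.2},
    {"has webbed feet": 0.9, "has a long curved beak": 0.7},
    {"has webbed feet": 1.0, "has a long curved beak": 0.8},
]
train_labels = ["woodpecker", "woodpecker", "pelican", "pelican"]

train_vectors = [featurize(img, VERBALIZED_FEATURES, stub_vlm_score)
                 for img in train_images]
centroids = fit_centroids(train_vectors, train_labels)

query = {"has a red crown": 0.7, "has webbed feet": 0.1}
query_vec = featurize(query, VERBALIZED_FEATURES, stub_vlm_score)
prediction = predict(query_vec, centroids)
print(prediction)
```

Because each vector dimension corresponds to one natural-language feature, a prediction can be traced back to the features that drove it, which is the interpretability property the abstract emphasizes.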
