The visual classification performance of vision-language models such as CLIP can benefit from additional semantic knowledge, e.g. via large language models (LLMs) such as GPT-3. Extending class names with LLM-generated class descriptors, e.g. ``waffle, \textit{which has a round shape}'', and averaging retrieval scores over multiple such descriptors, has been shown to improve generalization performance. In this work, we study this behavior in detail and propose \texttt{Waffle}CLIP, a framework for zero-shot visual classification which achieves similar performance gains on a large number of visual classification tasks by simply replacing LLM-generated descriptors with random character and word descriptors, \textbf{without} querying external models. We extend these results with an extensive experimental study of the impact and shortcomings of the additional semantics introduced via LLM-generated descriptors, and showcase how semantic context is better leveraged by automatically querying LLMs for high-level concepts, while jointly resolving potential class name ambiguities. Link to the codebase: https://github.com/ExplainableML/WaffleCLIP.
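To make the descriptor-ensembling idea concrete, the following is a minimal sketch in the spirit of \texttt{Waffle}CLIP, assuming OpenAI's \texttt{clip} package; the prompt template, word pool, and descriptor count are illustrative assumptions rather than the exact settings of the released implementation:

\begin{verbatim}
import random
import string

import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Illustrative word pool; random descriptors pair a word with a
# random character string, so no external model is ever queried.
WORD_POOL = ["anchor", "breeze", "copper", "dune", "ember"]

def random_descriptor(num_chars=6):
    # e.g. "ember qzvapt"
    word = random.choice(WORD_POOL)
    chars = "".join(random.choices(string.ascii_lowercase, k=num_chars))
    return f"{word} {chars}"

@torch.no_grad()
def class_embedding(classname, num_descriptors=15):
    # Average normalized text embeddings over several randomized
    # prompts, mirroring score averaging over LLM descriptors.
    prompts = [
        f"A photo of a {classname}, which has {random_descriptor()}."
        for _ in range(num_descriptors)
    ]
    tokens = clip.tokenize(prompts).to(device)
    feats = model.encode_text(tokens)
    feats = feats / feats.norm(dim=-1, keepdim=True)
    return feats.mean(dim=0)
\end{verbatim}

Zero-shot classification then reduces to comparing an image embedding against these averaged class embeddings via cosine similarity.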