Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jianyang Zhang

Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

Mar 26, 2025

Jianyang Zhang, Qianli Luo, Guowu Yang, Wenjing Yang, Weide Liu, Guosheng Lin, Fengmao Lv

Abstract:Language Bottleneck Models (LBMs) are proposed to achieve interpretable image recognition by classifying images based on textual concept bottlenecks. However, current LBMs simply list all concepts together as the bottleneck layer, leading to the spurious cue inference problem and cannot generalized to unseen classes. To address these limitations, we propose the Attribute-formed Language Bottleneck Model (ALBM). ALBM organizes concepts in the attribute-formed class-specific space, where concepts are descriptions of specific attributes for specific classes. In this way, ALBM can avoid the spurious cue inference problem by classifying solely based on the essential concepts of each class. In addition, the cross-class unified attribute set also ensures that the concept spaces of different classes have strong correlations, as a result, the learned concept classifier can be easily generalized to unseen classes. Moreover, to further improve interpretability, we propose Visual Attribute Prompt Learning (VAPL) to extract visual features on fine-grained attributes. Furthermore, to avoid labor-intensive concept annotation, we propose the Description, Summary, and Supplement (DSS) strategy to automatically generate high-quality concept sets with a complete and precise attribute. Extensive experiments on 9 widely used few-shot benchmarks demonstrate the interpretability, transferability, and performance of our approach. The code and collected concept sets are available at https://github.com/tiggers23/ALBM.

* This paper has been accepted to CVPR 2025

Via

Access Paper or Ask Questions

Learning Cross-domain Semantic-Visual Relation for Transductive Zero-Shot Learning

Mar 31, 2020

Jianyang Zhang, Fengmao Lv, Guowu Yang, Lei Feng, Yufeng Yu, Lixin Duan

Figure 1 for Learning Cross-domain Semantic-Visual Relation for Transductive Zero-Shot Learning

Figure 2 for Learning Cross-domain Semantic-Visual Relation for Transductive Zero-Shot Learning

Figure 3 for Learning Cross-domain Semantic-Visual Relation for Transductive Zero-Shot Learning

Figure 4 for Learning Cross-domain Semantic-Visual Relation for Transductive Zero-Shot Learning

Abstract:Zero-Shot Learning (ZSL) aims to learn recognition models for recognizing new classes without labeled data. In this work, we propose a novel approach dubbed Transferrable Semantic-Visual Relation (TSVR) to facilitate the cross-category transfer in transductive ZSL. Our approach draws on an intriguing insight connecting two challenging problems, i.e. domain adaptation and zero-shot learning. Domain adaptation aims to transfer knowledge across two different domains (i.e., source domain and target domain) that share the identical task/label space. For ZSL, the source and target domains have different tasks/label spaces. Hence, ZSL is usually considered as a more difficult transfer setting compared with domain adaptation. Although the existing ZSL approaches use semantic attributes of categories to bridge the source and target domains, their performances are far from satisfactory due to the large domain gap between different categories. In contrast, our method directly transforms ZSL into a domain adaptation task through redrawing ZSL as predicting the similarity/dissimilarity labels for the pairs of semantic attributes and visual features. For this redrawn domain adaptation problem, we propose to use a domain-specific batch normalization component to reduce the domain discrepancy of semantic-visual pairs. Experimental results over diverse ZSL benchmarks clearly demonstrate the superiority of our method.

Via

Access Paper or Ask Questions