Title: Cross-Level Multi-Instance Distillation for Self-Supervised Fine-Grained Visual Categorization

Jan 16, 2024

Qi Bi, Wei Ji, Jingjun Yi, Haolan Zhan, Gui-Song Xia



Abstract: High-quality annotation of fine-grained visual categories demands considerable expert knowledge, which makes it taxing and time-consuming. Alternatively, learning fine-grained visual representations from enormous numbers of unlabeled images (e.g., species, brands) by self-supervised learning becomes a feasible solution. However, recent research finds that existing self-supervised learning methods are poorly suited to representing fine-grained categories. The bottleneck is that the pre-text representation is built from every patch-wise embedding, while fine-grained categories are determined by only a few key patches of an image. In this paper, we propose a Cross-level Multi-instance Distillation (CMD) framework to tackle this challenge. Our key idea is to use multiple instance learning to account for the importance of each image patch in determining the fine-grained pre-text representation. To comprehensively learn the relation between informative patches and fine-grained semantics, multi-instance knowledge distillation is applied both to the region/image crop pairs from the teacher and student nets and to the region-image crops inside the teacher/student net, which we term intra-level multi-instance distillation and inter-level multi-instance distillation, respectively. Extensive experiments on CUB-200-2011, Stanford Cars and FGVC Aircraft show that the proposed method outperforms the contemporary method by up to 10.14% and existing state-of-the-art self-supervised learning approaches by up to 19.78% on both top-1 accuracy and the Rank-1 retrieval metric.
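
The abstract gives no equations, so the following is only a rough illustration of the two ingredients it names: weighting patches by importance (multiple instance learning) and distilling a teacher's output into a student. It is not the CMD implementation. The dot-product scoring vector, the function names, and the temperature values are all assumptions made for this sketch.

```python
import math


def softmax(xs, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(xs, temperature=1.0):
    """Numerically stable log-softmax (avoids taking log of an underflowed 0)."""
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]


def mil_pool(patch_embeddings, scoring_vector):
    """Attention-style MIL pooling (assumed form, not taken from the paper).

    Each patch gets a dot-product score against `scoring_vector`; a softmax
    over the scores gives per-patch weights, and the pooled representation
    is the weighted sum of the patch embeddings.
    """
    scores = [
        sum(p * w for p, w in zip(patch, scoring_vector))
        for patch in patch_embeddings
    ]
    weights = softmax(scores)
    dim = len(patch_embeddings[0])
    pooled = [
        sum(w * patch[d] for w, patch in zip(weights, patch_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


def distillation_loss(teacher_logits, student_logits,
                      t_teacher=0.04, t_student=0.1):
    """Cross-entropy between teacher and student output distributions.

    The temperatures are illustrative defaults, not values from the paper.
    """
    p_teacher = softmax(teacher_logits, t_teacher)
    log_p_student = log_softmax(student_logits, t_student)
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))


# Toy example: three 2-D patch embeddings; the scoring vector favors dimension 0.
patches = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
pooled, weights = mil_pool(patches, scoring_vector=[2.0, 0.0])
loss = distillation_loss(teacher_logits=[1.0, 2.0, 3.0],
                         student_logits=[3.0, 2.0, 1.0])
```

Under this reading, the intra-level term would apply such a loss to teacher and student outputs for crops at the same level (region with region, image with image), and the inter-level term would apply it between the region and image crops inside one network. How the paper actually combines or weights these terms is not stated in the abstract.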

* work in progress
