Title: Discriminative Sampling of Proposals in Self-Supervised Transformers for Weakly Supervised Object Localization

Sep 09, 2022

Shakeeb Murtaza, Soufiane Belharbi, Marco Pedersoli, Aydin Sarraf, Eric Granger

Abstract: Self-supervised vision transformers can generate accurate localization maps of the objects in an image. However, because they decompose the scene into multiple maps containing various objects and do not rely on any explicit supervisory signal, they cannot distinguish the object of interest from other objects, as required in weakly supervised object localization (WSOL). To address this issue, we propose leveraging the multiple maps generated by the different transformer heads to acquire pseudo-labels for training a WSOL model. In particular, a new Discriminative Proposals Sampling (DiPS) method is introduced that relies on a pretrained CNN classifier to identify discriminative regions. Foreground and background pixels are then sampled from these regions to train a WSOL model that generates activation maps capable of accurately localizing objects belonging to a specific class. Empirical results on the challenging CUB, OpenImages, and ILSVRC benchmark datasets indicate that our proposed approach can outperform state-of-the-art methods over a wide range of threshold values. DiPS provides class activation maps with better coverage of foreground object regions relative to the background.
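
The abstract describes the pipeline only at a high level, so the following is a minimal sketch of the general idea, not the authors' implementation. It assumes the transformer heads have already produced a stack of proposal maps, and it makes up two details the abstract does not specify: a proposal is chosen by masking the image with each map and keeping the one a classifier scores highest, and foreground/background pixels are drawn uniformly from the strongest and weakest activations of that map. The names `select_proposal`, `sample_pseudo_labels`, `score_fn`, and `pool_frac` are hypothetical.

```python
import numpy as np


def select_proposal(image, proposals, score_fn):
    """Return the index of the proposal whose masked image scores highest.

    image:     (H, W, C) array.
    proposals: (K, H, W) array of maps in [0, 1], one per transformer head.
    score_fn:  hypothetical stand-in for a pretrained classifier; maps a
               masked image to a scalar score for the target class.
    """
    scores = [score_fn(image * p[..., None]) for p in proposals]
    return int(np.argmax(scores))


def sample_pseudo_labels(proposal, n_fg, n_bg, rng, pool_frac=0.1):
    """Sample sparse pixel-wise pseudo-labels from one proposal map.

    Returns an (H, W) int8 map: 1 = foreground, 0 = background,
    -1 = unlabeled (ignored when training the localization model).
    Assumes the two candidate pools are small enough not to overlap.
    """
    flat = proposal.ravel()
    order = np.argsort(flat)  # pixel indices, ascending activation
    pool = max(int(pool_frac * flat.size), n_fg, n_bg)
    bg_idx = rng.choice(order[:pool], size=n_bg, replace=False)
    fg_idx = rng.choice(order[-pool:], size=n_fg, replace=False)
    labels = np.full(flat.size, -1, dtype=np.int8)
    labels[bg_idx] = 0
    labels[fg_idx] = 1
    return labels.reshape(proposal.shape)
```

In a real setting `score_fn` would wrap a pretrained CNN classifier and the resulting label map would supervise a separate WSOL model; both of those steps are outside the scope of this sketch.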
