Title: Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?

Sep 13, 2023

Bill Psomas, Ioannis Kakogeorgiou, Konstantinos Karantzalos, Yannis Avrithis


Abstract: Convolutional networks and vision transformers have different forms of pairwise interactions, pooling across layers and pooling at the end of the network. Does the latter really need to be different? As a by-product of pooling, vision transformers provide spatial attention for free, but this is most often of low quality unless self-supervised, which is not well studied. Is supervision really the problem? In this work, we develop a generic pooling framework and then formulate a number of existing methods as instantiations. By discussing the properties of each group of methods, we derive SimPool, a simple attention-based pooling mechanism that replaces the default one in both convolutional and transformer encoders. We find that, whether supervised or self-supervised, this improves performance on pre-training and downstream tasks and provides attention maps delineating object boundaries in all cases. One could thus call SimPool universal. To our knowledge, we are the first to obtain attention maps in supervised transformers of at least as good quality as self-supervised, without explicit losses or modifying the architecture. Code at: https://github.com/billpsomas/simpool.
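To make the idea of attention-based pooling concrete, here is a minimal pure-Python sketch. It is not the authors' SimPool implementation (see the linked repository for that). It assumes, purely for illustration, that the query is the average of the patch features and that there are no learned projections; the name `attention_pool` and the toy `patches` data are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features):
    """Pool a list of patch feature vectors into a single vector.

    Assumption (not stated in the abstract): the query is the global
    average of the patch features and no learned projections are used.
    """
    n, d = len(features), len(features[0])
    # Query: global average of the patch features (hypothetical choice).
    query = [sum(f[j] for f in features) / n for j in range(d)]
    # Scaled dot-product similarity between the query and each patch.
    scores = [sum(q * x for q, x in zip(query, f)) / math.sqrt(d)
              for f in features]
    # One attention weight per patch; the weights sum to 1.
    weights = softmax(scores)
    # Pooled representation: attention-weighted average of the patches.
    pooled = [sum(w * f[j] for w, f in zip(weights, features))
              for j in range(d)]
    return pooled, weights

# Toy input: three 2-dimensional "patch" features.
patches = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
pooled, attn = attention_pool(patches)
```

In this sketch the returned `attn` weights play the role of the spatial attention map that comes for free with pooling: reshaped to the patch grid, they show which locations contribute most to the pooled vector.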

* International Conference on Computer Vision (ICCV), 2023. Code and models: https://github.com/billpsomas/simpool
