Title: Sparse autoencoders reveal selective remapping of visual concepts during adaptation

Dec 06, 2024

Hyesu Lim, Jinho Choi, Jaegul Choo, Steffen Schneider

Abstract: Adapting foundation models for specific purposes has become a standard approach to build machine learning systems for downstream applications. Yet, it is an open question which mechanisms take place during adaptation. Here we develop a new Sparse Autoencoder (SAE) for the CLIP vision transformer, named PatchSAE, to extract interpretable concepts at granular levels (e.g. shape, color, or semantics of an object) and their patch-wise spatial attributions. We explore how these concepts influence the model output in downstream image classification tasks and investigate how recent state-of-the-art prompt-based adaptation techniques change the association of model inputs to these concepts. While activations of concepts slightly change between adapted and non-adapted models, we find that the majority of gains on common adaptation tasks can be explained with the existing concepts already present in the non-adapted foundation model. This work provides a concrete framework to train and use SAEs for Vision Transformers and provides insights into explaining adaptation mechanisms.

* A demo is available at github.com/dynamical-inference/patchsae
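
To make the idea of patch-wise concept activations concrete, here is a minimal sketch of a generic sparse autoencoder applied to per-patch token activations. It is not the PatchSAE implementation: the abstract does not specify the architecture, so the ReLU encoder, the L1 sparsity penalty, the toy dimensions, the random stand-in activations (in place of real CLIP features), and all function and variable names are assumptions made for illustration.

```python
import numpy as np


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode patch activations into sparse latents and reconstruct them.

    x has shape (n_patches, d_model). Returns the latents with shape
    (n_patches, d_sae) and the reconstruction with shape (n_patches, d_model).
    """
    # ReLU keeps latents non-negative; sparsity is encouraged by the L1 term in the loss.
    z = np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)
    x_hat = z @ W_dec + b_dec
    return z, x_hat


def sae_loss(x, z, x_hat, l1_coeff=1e-3):
    """Reconstruction error plus an L1 sparsity penalty on the latents."""
    mse = np.mean((x - x_hat) ** 2)
    l1 = np.mean(np.sum(np.abs(z), axis=-1))
    return mse + l1_coeff * l1


# Toy sizes (assumed): a 7x7 grid of patches, 16-dim activations, 64 latents.
rng = np.random.default_rng(0)
grid, d_model, d_sae = 7, 16, 64
n_patches = grid * grid

# Random stand-in for the vision transformer's per-patch activations.
x = rng.normal(size=(n_patches, d_model))

# Randomly initialised (untrained) SAE parameters.
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_dec = np.zeros(d_model)

z, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, z, x_hat)

# One latent's activation over the patch grid serves as a spatial attribution map.
concept = 3  # arbitrary latent index, for illustration only
attribution_map = z[:, concept].reshape(grid, grid)
```

Because the encoder is applied to each patch token independently, every latent yields one activation per patch, and reshaping those activations to the patch grid shows where in the image that latent fires. In a trained SAE, each such latent would ideally correspond to one interpretable concept.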
