Title: Tokens on Demand: Token Condensation as Training-free Test-time Adaptation

Oct 16, 2024

Zixin Wang, Dong Gong, Sen Wang, Zi Huang, Yadan Luo

Abstract: In this work, we introduce Token Condensation as Adaptation (TCA), a training-free approach designed to mitigate distribution shifts encountered by vision-language models (VLMs) during test-time inference. TCA bridges distribution gaps at the patch level by condensing image tokens that exhibit low attentiveness to the <cls> token. Recognizing that the <cls> token may correspond to universal concepts, TCA identifies and tracks the most reliable <cls> tokens that align specifically with target classes from historical data streams. To achieve this, we propose a context token reservoir (CTR), which retains tokens with the lowest uncertainty as "anchors" to guide the preservation of class-relevant tokens during inference. These anchors, in turn, act as token-level classifiers to correct VLM predictions and improve visual-text alignment. Utilizing anchors sampled from CTR, TCA condenses tokens through two operations: (1) pruning class-irrelevant tokens that consistently rank low across all attention heads, so that the heads reach a consensus on their irrelevance, and (2) merging the remaining class-ambiguous tokens into representative centers using coreset selection, which maintains linear computational complexity. As the first method to explore token efficiency in test-time adaptation, TCA consistently demonstrates superior performance across cross-dataset and out-of-distribution adaptation tasks, reducing GFLOPs by 12.2% to 48.9% while achieving accuracy improvements of up to 21.4% over the strongest baseline without introducing additional parameters.
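The two condensation operations described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `prune_by_head_consensus` and `merge_to_centers`, the parameters `keep_ratio` and `num_centers`, the "outside the top-k in every head" pruning rule, and the use of greedy k-center (farthest-point) selection as the coreset step are all assumptions made for illustration. The context token reservoir and its anchors are omitted; the sketch takes the per-head <cls> attention as given.

```python
import numpy as np


def prune_by_head_consensus(cls_attn, keep_ratio=0.5):
    """Return indices of tokens kept after cross-head consensus pruning.

    cls_attn: array of shape (H, N), attention from <cls> to N patch
    tokens for each of H heads. A token is pruned only if EVERY head
    ranks it outside its top-k (an assumed consensus rule).
    """
    num_heads, num_tokens = cls_attn.shape
    # Per-head rank of each token: 0 means most attended in that head.
    order = np.argsort(-cls_attn, axis=1)
    ranks = np.empty_like(order)
    rows = np.arange(num_heads)[:, None]
    ranks[rows, order] = np.arange(num_tokens)[None, :]
    num_keep = max(1, int(round(keep_ratio * num_tokens)))
    low_in_all_heads = (ranks >= num_keep).all(axis=0)
    return np.flatnonzero(~low_in_all_heads)


def merge_to_centers(tokens, num_centers):
    """Merge tokens of shape (N, D) into at most num_centers averaged tokens.

    Centers are picked by greedy k-center (farthest-point) selection,
    an assumed stand-in for the coreset step; each pass costs O(N) for
    a fixed number of centers.
    """
    centers = [0]  # arbitrary starting token
    dist = np.linalg.norm(tokens - tokens[0], axis=1)
    for _ in range(min(num_centers, len(tokens)) - 1):
        if dist.max() == 0:  # all remaining tokens duplicate a center
            break
        nxt = int(np.argmax(dist))
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(tokens - tokens[nxt], axis=1))
    # Assign every token to its nearest center, then average each group.
    to_centers = np.linalg.norm(
        tokens[:, None, :] - tokens[centers][None, :, :], axis=2
    )
    assign = to_centers.argmin(axis=1)
    return np.stack(
        [tokens[assign == c].mean(axis=0) for c in range(len(centers))]
    )


# Toy usage: 2 heads, 4 patch tokens with 2-D features.
attn = np.array([[0.4, 0.3, 0.2, 0.1],
                 [0.1, 0.4, 0.3, 0.2]])
kept = prune_by_head_consensus(attn, keep_ratio=0.5)
feats = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
merged = merge_to_centers(feats, num_centers=2)
```

In the toy example only the last token falls outside the top two in both heads, so it is the only one pruned. A stricter or softer consensus rule (for example, averaging ranks across heads) would be an equally plausible reading of the abstract.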

* 18 pages, 7 figures