Title: DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention

Oct 11, 2024

Nguyen Huu Bao Long, Chenyu Zhang, Yuzhi Shi, Tsubasa Hirakawa, Takayoshi Yamashita, Tohgoroh Matsui, Hironobu Fujiyoshi

Abstract: Vision Transformers with various attention modules have demonstrated superior performance on vision tasks. Sparsity-adaptive attention, as used in DAT, has yielded strong results in image classification, but the key-value pairs selected by its deformable points lack semantic relevance when the model is fine-tuned for semantic segmentation. The query-aware sparse attention in BiFormer focuses each query on its top-k routed regions. During the attention calculation, however, the selected key-value pairs are influenced by too many irrelevant queries, which reduces attention on the more important ones. To address these issues, we propose the Deformable Bi-level Routing Attention (DBRA) module, which optimizes the selection of key-value pairs using agent queries and enhances the interpretability of queries in attention maps. Based on this module, we introduce the Deformable Bi-level Routing Attention Transformer (DeBiFormer), a novel general-purpose vision transformer built with DBRA. DeBiFormer has been validated on various computer vision tasks, including image classification, object detection, and semantic segmentation, providing strong evidence of its effectiveness. Code is available at https://github.com/maclong01/DeBiFormer
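To make the "top-k routed regions" idea from the abstract concrete, here is a minimal sketch of region-level routing in plain Python. It is not the DBRA module and is not taken from the authors' repository. It only illustrates the general routing step, under several simplifying assumptions: tokens are grouped into regions as consecutive 1D chunks rather than 2D windows, region descriptors are plain averages, and affinity is a dot product. The function names `region_means`, `dot`, and `topk_routing` are hypothetical.

```python
def region_means(tokens, region_size):
    """Average consecutive groups of `region_size` token vectors into one vector per region.

    Assumption: regions are 1D chunks of the token list (a simplification of 2D windows).
    """
    regions = []
    for start in range(0, len(tokens), region_size):
        group = tokens[start:start + region_size]
        dim = len(group[0])
        regions.append([sum(vec[d] for vec in group) / len(group) for d in range(dim)])
    return regions


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def topk_routing(queries, keys, region_size, k):
    """For each query region, return the indices of the k key regions with the highest affinity."""
    q_regions = region_means(queries, region_size)
    k_regions = region_means(keys, region_size)
    routed = []
    for q in q_regions:
        scores = [dot(q, kr) for kr in k_regions]
        # Sort key-region indices by descending affinity and keep the first k.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        routed.append(order[:k])
    return routed


# Toy example: 4 tokens of dimension 2, grouped into 2 regions of 2 tokens each.
queries = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
keys = [[2.0, 0.0], [2.0, 0.0], [0.0, 3.0], [0.0, 3.0]]
routes = topk_routing(queries, keys, region_size=2, k=1)
print(routes)
```

In a full attention layer, each query token would then attend only to the key-value tokens gathered from its routed regions. The agent-query and deformable-point mechanisms that the abstract attributes to DBRA are not modeled here.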

* ACCV 2024 * 20 pages, 7 figures. arXiv admin note: text overlap with arXiv:2303.08810 by other authors
