Title: Vision Transformer with Quadrangle Attention

Mar 27, 2023

Qiming Zhang, Jing Zhang, Yufei Xu, Dacheng Tao

Abstract: Window-based attention has become a popular choice in vision transformers due to its superior performance, lower computational complexity, and smaller memory footprint. However, hand-crafted windows are data-agnostic, which limits the ability of transformers to adapt to objects of varying sizes, shapes, and orientations. To address this issue, we propose a novel quadrangle attention (QA) method that extends window-based attention to a general quadrangle formulation. Our method employs an end-to-end learnable quadrangle regression module that predicts a transformation matrix to transform default windows into target quadrangles for token sampling and attention calculation, enabling the network to model targets with different shapes and orientations and to capture rich context information. We integrate QA into plain and hierarchical vision transformers to create a new architecture named QFormer, which requires only minor code modifications and adds negligible extra computational cost. Extensive experiments on public benchmarks demonstrate that QFormer outperforms existing representative vision transformers on various vision tasks, including classification, object detection, semantic segmentation, and pose estimation. The code will be made publicly available at https://github.com/ViTAE-Transformer/QFormer.
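
As a rough illustration of what "transforming a default window into a target quadrangle" can mean geometrically, the sketch below maps the four corners of a square window through a 3x3 projective matrix using NumPy. It is not the QFormer implementation: the function name `transform_window`, the corner layout, and the hand-written example matrix (which stands in for the output of the learned regression module) are all assumptions made for illustration.

```python
import numpy as np

def transform_window(corners, matrix):
    """Map 2D points of shape (N, 2) through a 3x3 projective matrix."""
    corners = np.asarray(corners, dtype=float)
    # Lift each point to homogeneous coordinates (x, y, 1).
    homogeneous = np.hstack([corners, np.ones((corners.shape[0], 1))])
    mapped = homogeneous @ np.asarray(matrix, dtype=float).T
    # Divide by the last coordinate to return to 2D.
    return mapped[:, :2] / mapped[:, 2:3]

# Default window: a square centred at the origin (assumed corner layout).
default_window = np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])

# Hypothetical "predicted" transform: scale by 2, rotate by 90 degrees,
# then shift by (1, 0). In the paper this matrix comes from a learned module.
theta = np.pi / 2
scale = 2.0
matrix = np.array([
    [scale * np.cos(theta), -scale * np.sin(theta), 1.0],
    [scale * np.sin(theta),  scale * np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])

quadrangle = transform_window(default_window, matrix)
print(quadrangle)
```

With a non-trivial last row in `matrix`, the same function would produce a general (non-rectangular) quadrangle. In an attention layer the transformed region would then be used to decide where key and value tokens are sampled, a step this sketch leaves out.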

* 15 pages, the extension of the ECCV 2022 paper (VSA: Learning Varied-Size Window Attention in Vision Transformers)
