Title: Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation

Apr 23, 2024

Abhishek Aich, Yumin Suh, Samuel Schulter, Manmohan Chandraker

Abstract: A powerful architecture for universal segmentation relies on transformers that encode multi-scale image features and decode object queries into mask predictions. With efficiency a high priority for scaling such models, we observe that the state-of-the-art method Mask2Former spends ~50% of its compute on the transformer encoder alone. This is because every encoder layer retains a full-length token-level representation of all backbone feature scales. Based on this observation, we propose PROgressive Token Length SCALing for Efficient transformer encoders (PRO-SCALE), a strategy that can be plugged into Mask2Former-style segmentation architectures to significantly reduce computational cost. The underlying principle of PRO-SCALE is to progressively scale the token length with the layers of the encoder. This allows PRO-SCALE to reduce computation by a large margin with minimal sacrifice in performance (~52% GFLOPs reduction with no drop in performance on the COCO dataset). We validate our framework on multiple public benchmarks.
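
To make the stated principle concrete, here is a minimal sketch of how growing the token length across encoder layers reduces the number of tokens processed. It is an illustration under assumptions, not the authors' implementation. The abstract does not specify the schedule, so the feature strides (32, 16, 8), the six encoder layers, the 1024x1024 input, and the rule "early layers see only the coarsest scales, later layers add finer ones" are all assumed, and the function names are hypothetical. The result counts tokens per layer, which is only a rough proxy for GFLOPs and should not be compared with the figures reported above.

```python
def tokens_per_scale(height, width, strides=(32, 16, 8)):
    """Tokens contributed by each backbone feature scale, coarse to fine.

    The strides are an assumption for illustration, not taken from the abstract.
    """
    return [(height // s) * (width // s) for s in strides]


def full_length_schedule(scale_tokens, num_layers):
    """Baseline: every encoder layer processes the tokens of all scales."""
    return [sum(scale_tokens)] * num_layers


def progressive_schedule(scale_tokens, layers_per_stage):
    """Hypothetical progressive schedule.

    Stage i processes only the (i + 1) coarsest scales, so the token
    length grows with encoder depth and reaches full length at the end.
    """
    schedule = []
    for stage, n_layers in enumerate(layers_per_stage):
        schedule.extend([sum(scale_tokens[: stage + 1])] * n_layers)
    return schedule


# Assumed setting: 1024x1024 input, 6 encoder layers split into 3 stages.
scale_tokens = tokens_per_scale(1024, 1024)
baseline = full_length_schedule(scale_tokens, num_layers=6)
progressive = progressive_schedule(scale_tokens, layers_per_stage=(2, 2, 2))

# Fraction of per-layer token processing avoided (a proxy, not GFLOPs).
saving = 1 - sum(progressive) / sum(baseline)

print("tokens per scale:", scale_tokens)
print("baseline tokens per layer:", baseline)
print("progressive tokens per layer:", progressive)
print(f"token-layer saving: {saving:.1%}")
```

In this toy setting the last stage still sees the full-length representation, so the decoder would receive features at every scale. The saving comes entirely from the early layers, where the finest and most numerous tokens are left out.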
