Title: A Fast Training-Free Compression Framework for Vision Transformers

Mar 04, 2023

Jung Hwan Heo, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram

Abstract: Token pruning has emerged as an effective solution to speed up the inference of large Transformer models. However, prior work on accelerating Vision Transformer (ViT) models requires training from scratch or fine-tuning with additional parameters, which prevents simple plug-and-play deployment. To avoid high training costs during the deployment stage, we present a fast training-free compression framework enabled by (i) a dense feature extractor in the initial layers; (ii) a sharpness-minimized model, which is more compressible; and (iii) a local-global token merger that can exploit spatial relationships in various contexts. We applied our framework to various ViT and DeiT models and achieved up to a 2x reduction in FLOPs and a 1.8x speedup in inference throughput with <1% accuracy loss, while requiring training times two orders of magnitude shorter than those of existing approaches. Code will be available at https://github.com/johnheo/fast-compress-vit
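
To make the idea of token merging concrete, here is a minimal sketch of a generic, greedy similarity-based merge. It is not the local-global token merger described in the abstract, whose details are not given on this page. The function names `cosine` and `merge_most_similar`, the choice of cosine similarity, and the size-weighted averaging are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_most_similar(tokens, num_merges):
    """Hypothetical helper: repeatedly merge the most similar pair of tokens.

    Each merge replaces two tokens with their size-weighted average, so the
    sequence shrinks by one token per merge. Returns the merged tokens and
    the number of original tokens each one represents.
    """
    tokens = [list(t) for t in tokens]  # copy so the input is not mutated
    sizes = [1] * len(tokens)
    for _ in range(num_merges):
        if len(tokens) < 2:
            break
        # Exhaustive search for the most similar pair (quadratic; fine for a sketch).
        best = None
        for i in range(len(tokens)):
            for j in range(i + 1, len(tokens)):
                score = cosine(tokens[i], tokens[j])
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        total = sizes[i] + sizes[j]
        tokens[i] = [
            (sizes[i] * x + sizes[j] * y) / total
            for x, y in zip(tokens[i], tokens[j])
        ]
        sizes[i] = total
        del tokens[j]
        del sizes[j]
    return tokens, sizes


if __name__ == "__main__":
    toy_tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.0, 2.0]]
    merged, sizes = merge_most_similar(toy_tokens, num_merges=1)
    print(len(merged), sizes)
```

Because a merge of this kind only operates on the token sequence, it needs no additional trainable parameters, which is the property that makes training-free use possible in principle. A practical implementation would work on batched tensors and avoid the exhaustive pair search.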

* Preprint. 13 pages, 9 Figures, 8 Tables
