Network quantization significantly reduces model inference complexity and has been widely used in real-world deployments. However, most existing quantization methods have been developed and tested mainly on Convolutional Neural Networks (CNNs), and they suffer severe degradation when applied to Transformer-based architectures. In this work, we present a systematic method to reduce the performance degradation and inference complexity of quantized Transformers. In particular, we propose Powers-of-Two Scale (PTS) to handle the severe inter-channel variation of LayerNorm inputs in a hardware-friendly way. In addition, we propose Log-Int-Softmax (LIS), which sustains the extremely non-uniform distribution of the attention maps while simplifying inference through 4-bit quantization and the BitShift operator. Comprehensive experiments on various Transformer-based architectures and benchmarks show that our methods outperform previous works in accuracy while using even lower bit-width for the attention maps. For instance, we reach 85.17% Top-1 accuracy with ViT-L on ImageNet and 51.4 mAP with Cascade Mask R-CNN (Swin-S) on COCO. To our knowledge, we are the first to achieve comparable accuracy degradation (~1%) on fully quantized Vision Transformers. Code is available at https://github.com/linyang-zhh/FQ-ViT.
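To make the two ideas concrete, the following is a minimal PyTorch sketch, not the paper's implementation: it illustrates (a) per-channel quantization with scales constrained to powers of two, so rescaling becomes a bit shift, and (b) log2 quantization of softmax outputs to 4 bits. The function names, tensor layout ([tokens, channels]), and clamping details are illustrative assumptions.

```python
import torch

def power_of_two_scale_quantize(x, num_bits=8):
    """Sketch (assumed layout: x is [tokens, channels]): per-channel
    quantization where each channel's scale is rounded to a power of two,
    so inter-channel rescaling can be done with bit shifts."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = x.abs().amax(dim=0).clamp(min=1e-8)      # per-channel range
    alpha = torch.ceil(torch.log2(max_abs / qmax))     # integer exponent per channel
    scale = 2.0 ** alpha                               # power-of-two scale
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q, scale

def log2_quantize_attention(attn_probs, num_bits=4):
    """Sketch: attention probabilities in (0, 1] are mapped to integer
    exponents e so that the value is approximated by 2^{-e}; multiplying
    by such a weight at inference reduces to a right shift."""
    levels = 2 ** num_bits - 1
    clipped = attn_probs.clamp(min=2.0 ** -levels)
    exponents = torch.clamp(torch.round(-torch.log2(clipped)), 0, levels)
    return exponents                                   # dequantized value ~ 2^{-exponents}
```

Under these assumptions, both operations keep the integer-only flavor described above: the power-of-two scales avoid floating-point rescaling between channels, and the log-domain attention weights replace multiplications with shifts.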