Peiqin Sun

Three Guidelines You Should Know for Universally Slimmable Self-Supervised Learning

Mar 13, 2023

Yun-Hao Cao, Peiqin Sun, Shuchang Zhou

Abstract: We propose universally slimmable self-supervised learning (dubbed US3L) to achieve better accuracy-efficiency trade-offs for deploying self-supervised models across different devices. We observe that directly adapting self-supervised learning (SSL) to universally slimmable networks misbehaves, as the training process frequently collapses. We then discover that temporally consistent guidance is the key to the success of SSL for universally slimmable networks, and we propose three guidelines for the loss design to ensure this temporal consistency from a unified gradient perspective. Moreover, we propose dynamic sampling and group regularization strategies to simultaneously improve training efficiency and accuracy. Our US3L method has been empirically validated on both convolutional neural networks and vision transformers. With a single training run and one copy of weights, our method outperforms various state-of-the-art methods (individually trained or not) on benchmarks including recognition, object detection and instance segmentation. Our code is available at https://github.com/megvii-research/US3L-CVPR2023.

* Accepted to CVPR 2023
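
The idea of one copy of weights serving many network widths can be illustrated with a toy example. The sketch below is not the US3L implementation: `slim_linear` and `sample_widths` are hypothetical helpers, only the output dimension is slimmed, and the "largest, smallest, plus random widths" sampling is an assumed scheme commonly used with slimmable networks, not the dynamic sampling strategy the abstract refers to.

```python
import random

def slim_linear(weights, x, width_ratio):
    """Run a linear layer using only the first fraction of its output rows.

    Every width shares the same weight matrix; a narrower sub-network
    simply reads a prefix of the rows (illustrative assumption).
    """
    active = max(1, round(len(weights) * width_ratio))
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights[:active]]

def sample_widths(min_width=0.25, max_width=1.0, num_random=2, rng=random):
    """Pick the widths to train in one step: largest, smallest, then random ones."""
    widths = [max_width, min_width]
    widths += [rng.uniform(min_width, max_width) for _ in range(num_random)]
    return widths

# One weight matrix with four output units and two inputs.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
x = [3.0, 4.0]

full = slim_linear(weights, x, 1.0)  # all four output units
half = slim_linear(weights, x, 0.5)  # only the first two output units
```

Because the half-width output is a prefix of the full-width output, the narrow sub-network costs no extra storage, which is the property the abstract relies on.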

Synergistic Self-supervised and Quantization Learning

Jul 12, 2022

Yun-Hao Cao, Peiqin Sun, Yechang Huang, Jianxin Wu, Shuchang Zhou

Abstract: With the success of self-supervised learning (SSL), fine-tuning from self-supervised pretrained models has become a mainstream paradigm for boosting performance on downstream tasks. However, we find that current SSL models suffer severe accuracy drops under low-bit quantization, which prohibits their deployment in resource-constrained applications. In this paper, we propose a method called synergistic self-supervised and quantization learning (SSQL) to pretrain quantization-friendly self-supervised models that facilitate downstream deployment. SSQL contrasts the features of the quantized and full-precision models in a self-supervised fashion, where the bit-width for the quantized model is randomly selected in each step. SSQL not only significantly improves the accuracy when quantized to lower bit-widths, but also boosts the accuracy of full-precision models in most cases. By training only once, SSQL can then benefit various downstream tasks at different bit-widths simultaneously. Moreover, the bit-width flexibility is achieved without additional storage overhead, requiring only one copy of weights during training and inference. We theoretically analyze the optimization process of SSQL, and conduct exhaustive experiments on various benchmarks to further demonstrate the effectiveness of our method. Our code is available at https://github.com/megvii-research/SSQL-ECCV2022.

* Accepted to ECCV 2022 oral
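
A rough sketch of the training signal described in the abstract: pick a random bit-width, quantize the model, and compare its features with those of the full-precision model. Everything here is an assumption made for illustration: the "model" is a single linear layer, `quantize_matrix` is a plain symmetric uniform quantizer, and the loss is a cosine distance rather than the contrastive objective used in SSQL.

```python
import math
import random

def linear(weights, x):
    """Compute features of a toy one-layer model."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def quantize_matrix(weights, bits):
    """Symmetric uniform quantization with one scale for the whole matrix.

    Requires bits >= 2 so that at least one positive level exists.
    """
    flat = [w for row in weights for w in row]
    levels = 2 ** (bits - 1) - 1
    scale = (max(abs(w) for w in flat) or 1.0) / levels
    return [[round(w / scale) * scale for w in row] for row in weights]

def cosine_distance(a, b):
    """Return 1 - cosine similarity, or 0.0 if either vector is all zeros."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return 1.0 - dot / norm if norm > 0 else 0.0

def consistency_loss(weights, x, bit_choices=(2, 4, 8), rng=random):
    """Sample a bit-width for this step and compare quantized vs. full-precision features."""
    bits = rng.choice(bit_choices)
    full_features = linear(weights, x)
    quant_features = linear(quantize_matrix(weights, bits), x)
    return bits, cosine_distance(full_features, quant_features)

weights = [[0.9, -0.3], [0.1, 0.5]]
bits, loss = consistency_loss(weights, [1.0, 2.0], rng=random.Random(0))
```

Both branches read the same `weights`, so only one copy of the parameters is stored regardless of how many bit-widths are sampled.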

FQ-ViT: Fully Quantized Vision Transformer without Retraining

Nov 27, 2021

Yang Lin, Tianyu Zhang, Peiqin Sun, Zheng Li, Shuchang Zhou

Abstract: Network quantization significantly reduces model inference complexity and has been widely used in real-world deployments. However, most existing quantization methods have been developed and tested mainly on Convolutional Neural Networks (CNNs), and suffer severe degradation when applied to Transformer-based architectures. In this work, we present a systematic method to reduce the performance degradation and inference complexity of quantized Transformers. In particular, we propose Powers-of-Two Scale (PTS) to deal with the serious inter-channel variation of LayerNorm inputs in a hardware-friendly way. In addition, we propose Log-Int-Softmax (LIS), which can sustain the extremely non-uniform distribution of the attention maps while simplifying inference by using 4-bit quantization and the BitShift operator. Comprehensive experiments on various Transformer-based architectures and benchmarks show that our methods outperform previous works in performance while using even lower bit-widths in attention maps. For instance, we reach 85.17% Top-1 accuracy with ViT-L on ImageNet and 51.4 mAP with Cascade Mask R-CNN (Swin-S) on COCO. To our knowledge, we are the first to achieve comparable accuracy degradation (~1%) on fully quantized Vision Transformers. Code is available at https://github.com/linyang-zhh/FQ-ViT.

* 12 pages, 8 figures, open sourced
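
To make the "4-bit quantization and the BitShift operator" remark concrete, here is a small sketch of log2 quantization applied to attention probabilities. It is an assumption-laden illustration, not the FQ-ViT code: the softmax is computed in floating point (the integer-only parts of LIS are not shown), and `log2_quantize` and `apply_attention_shift` are hypothetical names.

```python
import math

def softmax(logits):
    """Numerically stable floating-point softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log2_quantize(probs, bits=4):
    """Encode each probability p as an integer code q with p ~= 2 ** -q."""
    max_code = 2 ** bits - 1
    codes = []
    for p in probs:
        if p <= 0.0:
            codes.append(max_code)  # smallest representable probability
        else:
            codes.append(min(max_code, max(0, round(-math.log2(p)))))
    return codes

def apply_attention_shift(codes, int_values):
    """Approximate sum(p * v) using right shifts only, since p ~= 2 ** -q."""
    return sum(v >> q for q, v in zip(codes, int_values))

codes = log2_quantize([0.5, 0.25, 0.25])
weighted = apply_attention_shift(codes, [8, 8, 16])
```

Attention maps concentrate most of their mass in a few entries, so spending the 16 available codes on powers of two keeps resolution near zero where most values lie, and the multiplication with the value tensor reduces to a shift.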

Optimal Quantization for Batch Normalization in Neural Network Deployments and Beyond

Aug 30, 2020

Dachao Lin, Peiqin Sun, Guangzeng Xie, Shuchang Zhou, Zhihua Zhang

Abstract: Quantized Neural Networks (QNNs) use low bit-width fixed-point numbers to represent weight parameters and activations, and are often used in real-world applications because they save computation resources and make results reproducible. Batch Normalization (BN) poses a challenge for QNNs because it requires floating-point arithmetic in reciprocal operations, and previous QNNs either compute BN at high precision or revise BN into variants in heuristic ways. In this work, we propose a novel method to quantize BN by converting an affine transformation with two floating-point parameters into a fixed-point operation with a shared quantized scale, which is friendly for hardware acceleration and model deployment. We confirm that our method maintains the same outputs through rigorous theoretical analysis and numerical analysis. The accuracy and efficiency of our quantization method are verified by layer-level experiments on the CIFAR and ImageNet datasets. We also believe that our method is potentially useful in other problems involving quantization.
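
The conversion described above can be sketched as follows. At inference time BN is the affine map y = a * x + b, and both floating-point parameters can be approximated by integers that share one power-of-two scale. This is a generic fixed-point approximation under assumed names (`fold_batch_norm`, `to_fixed_point`, `fixed_point_affine`) and an assumed `frac_bits` parameter; it does not reproduce the optimal parameter choice that the paper derives.

```python
import math

def fold_batch_norm(gamma, beta, mean, var, eps=1e-5):
    """Fold inference-time BN into a single affine map y = a * x + b."""
    a = gamma / math.sqrt(var + eps)
    b = beta - a * mean
    return a, b

def to_fixed_point(a, b, frac_bits=8):
    """Represent a and b as integers sharing the scale 2 ** frac_bits."""
    scale = 1 << frac_bits
    return round(a * scale), round(b * scale)

def fixed_point_affine(x_int, a_int, b_int, frac_bits=8):
    """Integer-only evaluation of round(a * x + b) using a multiply, add, and shift."""
    rounding = 1 << (frac_bits - 1)  # add half of the scale before shifting
    return (a_int * x_int + b_int + rounding) >> frac_bits

# eps=0.0 is used here only to keep the example values exact.
a, b = fold_batch_norm(gamma=2.0, beta=1.0, mean=3.0, var=4.0, eps=0.0)
a_int, b_int = to_fixed_point(a, b)
y = fixed_point_affine(10, a_int, b_int)
```

No division or square root remains in `fixed_point_affine`, which is the property that makes this form friendly to integer hardware.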

Yes-Net: An effective Detector Based on Global Information

Jun 30, 2017

Liangzhuang Ma, Xin Kan, Qianjiang Xiao, Wenlong Liu, Peiqin Sun

Abstract: This paper introduces a new real-time object detection approach named Yes-Net. Like YOLOv2 and SSD, it predicts bounding boxes and classes with a single neural network, but it offers more efficient and distinctive features. It combines local information with global information by adding an RNN architecture as a packed unit in the CNN model to form the basic feature extractor. Yes-Net also applies independent anchor boxes obtained from full-dimension k-means, which yield a better average IOU than grid anchor boxes. In addition, instead of NMS, Yes-Net uses an RNN as a filter to get the final boxes, which is more efficient. For 416 x 416 input, Yes-Net achieves 79.2% mAP on the VOC2007 test set at 39 FPS on an Nvidia Titan X Pascal.
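
The "average IOU" figure used to compare anchor sets can be computed as sketched below. This assumes boxes and anchors are given as `(cx, cy, w, h)` tuples, which is one reading of "full-dimension" (clustering over position as well as size); the k-means step itself and the helper names `iou` and `average_best_iou` are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def average_best_iou(boxes, anchors):
    """Mean, over ground-truth boxes, of the IOU with the best-matching anchor."""
    return sum(max(iou(b, a) for a in anchors) for b in boxes) / len(boxes)

boxes = [(0.0, 0.0, 2.0, 2.0), (5.0, 5.0, 2.0, 2.0)]
anchors = [(0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 2.0, 2.0)]
score = average_best_iou(boxes, anchors)
```

A higher score means the anchor set covers the ground-truth boxes more tightly before any regression is applied.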
