Abstract: Existing neural networks are memory-consuming and computationally intensive, making them challenging to deploy in resource-constrained environments. However, there are various methods to improve their efficiency. Two such methods are quantization, a well-known approach for network compression, and re-parametrization, an emerging technique designed to improve model performance. Although both techniques have been studied individually, there has been limited research on their simultaneous application. To address this gap, we propose a novel approach called RepQ, which applies quantization to re-parametrized networks. Our method is based on the insight that the test-stage weights of an arbitrary re-parametrized layer can be represented as a differentiable function of trainable parameters. We enable quantization-aware training by applying quantization on top of this function. RepQ generalizes well to various re-parametrized models and outperforms the baseline LSQ quantization scheme in all experiments.
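To make the core idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' implementation) of quantization applied on top of a merged re-parametrized kernel. It assumes a RepVGG-style block with 3x3 and 1x1 branches and a simple symmetric fake-quantizer with a straight-through estimator; the class and function names are illustrative only.

```python
# Sketch: quantization-aware training on the merged (deploy-time) kernel,
# which is itself a differentiable function of the branch parameters.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quant(w, num_bits=8):
    """Uniform symmetric fake-quantization with a straight-through estimator."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax, qmax) * scale
    return w + (w_q - w).detach()  # forward uses w_q, backward passes gradients to w


class QuantRepBlock(nn.Module):
    """RepVGG-style block: 3x3 + 1x1 branches trained jointly,
    quantized through their merged 3x3-equivalent kernel (illustrative)."""

    def __init__(self, channels):
        super().__init__()
        self.w3 = nn.Parameter(torch.randn(channels, channels, 3, 3) * 0.1)
        self.w1 = nn.Parameter(torch.randn(channels, channels, 1, 1) * 0.1)

    def merged_weight(self):
        # The test-stage kernel is a differentiable function of both branches.
        return self.w3 + F.pad(self.w1, (1, 1, 1, 1))

    def forward(self, x):
        w = fake_quant(self.merged_weight())  # quantize the merged kernel
        return F.conv2d(x, w, padding=1)


x = torch.randn(2, 16, 32, 32)
print(QuantRepBlock(16)(x).shape)  # torch.Size([2, 16, 32, 32])
```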
Abstract: In this paper, we propose the differentiable channel pruning search (DCPS) of convolutional neural networks. Unlike traditional channel pruning algorithms, which require users to manually set the prune ratio for each convolutional layer, DCPS searches for the optimal combination of prune ratios automatically. Inspired by differentiable architecture search (DARTS), we draw lessons from continuous relaxation and leverage gradient information to balance resource metrics and performance. However, directly applying the DARTS scheme causes a channel mismatch problem and huge memory consumption. Therefore, we introduce a novel weight-sharing technique that elegantly eliminates the shape mismatch problem with negligible additional resources. We test the proposed algorithm on image classification, where it achieves state-of-the-art pruning results on CIFAR-10, CIFAR-100, and ImageNet. DCPS is further applied to semantic segmentation on PASCAL VOC 2012 for two purposes: first, to demonstrate that task-specific channel pruning achieves better performance than transferring slimmed models, and second, to show the memory efficiency of DCPS, as this task demands a larger memory budget than classification. The experimental results validate the effectiveness and wide applicability of DCPS.
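For illustration, here is a minimal, hypothetical PyTorch sketch (not the paper's code) of the continuous relaxation with weight sharing: candidate prune ratios reuse slices of one full-width kernel and their outputs are mixed by softmax-weighted architecture parameters, so no shape mismatch arises between candidates. All names and the candidate ratio set are assumptions for the example.

```python
# Sketch: differentiable search over channel prune ratios with a shared kernel.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DifferentiablePrunedConv(nn.Module):
    def __init__(self, in_ch, out_ch, ratios=(0.25, 0.5, 0.75, 1.0)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)  # shared full-width weights
        self.ratios = ratios
        self.alpha = nn.Parameter(torch.zeros(len(ratios)))  # architecture parameters

    def forward(self, x):
        out = self.conv(x)
        probs = F.softmax(self.alpha, dim=0)  # continuous relaxation over candidates
        mixed = torch.zeros_like(out)
        for p, r in zip(probs, self.ratios):
            keep = max(1, int(out.size(1) * r))
            mask = torch.zeros(1, out.size(1), 1, 1, device=out.device)
            mask[:, :keep] = 1.0  # each candidate keeps a prefix of the shared channels
            mixed = mixed + p * (out * mask)
        return mixed


x = torch.randn(2, 16, 32, 32)
layer = DifferentiablePrunedConv(16, 32)
print(layer(x).shape)  # torch.Size([2, 32, 32, 32])
```

After training, one would keep only the channels selected by the highest-probability ratio in each layer, so the deployed model incurs none of the search-time overhead.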