Title: Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks

Nov 25, 2020

Jun Nishikawa, Ryoji Ikegaya

Abstract: Deep Neural Networks (DNNs) have many parameters and large amounts of activation data, both of which are expensive to implement. One way to reduce the size of a DNN is to quantize the pre-trained model, using a low-bit representation for weights and activations and fine-tuning to recover the drop in accuracy. However, it is generally difficult to train neural networks that use low-bit representations. One reason is that the weights in the middle layers of the DNN have a wide dynamic range; when this range is quantized into a few bits, the step size becomes large, which leads to a large quantization error and ultimately a large degradation in accuracy. To solve this problem, this paper makes the following three contributions without introducing any additional learnable parameters or hyper-parameters. First, we analyze how batch normalization, which causes the aforementioned problem, disturbs the fine-tuning of the quantized DNN. Second, based on these results, we propose a new pruning method called Pruning for Quantization (PfQ), which removes the filters that disturb the fine-tuning of the DNN while affecting the inferred result as little as possible. Third, we propose a fine-tuning workflow for quantized DNNs that uses the proposed pruning method (PfQ). Experiments using well-known models and datasets confirmed that the proposed method achieves higher performance at a similar model size than conventional quantization methods, including those that use fine-tuning.
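The link the abstract draws between dynamic range, step size, and quantization error can be illustrated with a small numerical sketch. This is not the authors' code and does not implement PfQ. It assumes a plain symmetric uniform quantizer, and the function name `quantize_uniform`, the 4-bit setting, and the synthetic weight distributions are all hypothetical choices made for illustration.

```python
import numpy as np

def quantize_uniform(w, num_bits):
    """Symmetric uniform quantization over the tensor's own range (illustrative only)."""
    levels = 2 ** (num_bits - 1) - 1       # e.g. 7 positive levels for 4 bits
    step = np.max(np.abs(w)) / levels      # step size grows with the dynamic range
    q = np.clip(np.round(w / step), -levels, levels)
    return q * step, step

rng = np.random.default_rng(0)

# Hypothetical weights: a narrow distribution, and a copy where a handful
# of large-magnitude entries widen the dynamic range.
narrow = rng.normal(0.0, 0.05, size=10_000)
wide = narrow.copy()
wide[:10] *= 100.0

for name, w in [("narrow", narrow), ("wide", wide)]:
    w_q, step = quantize_uniform(w, num_bits=4)
    mse = np.mean((w - w_q) ** 2)
    print(f"{name}: step={step:.4f}, mse={mse:.6f}")
```

In this toy setting, a few large weights stretch the quantization grid so that most of the remaining weights collapse onto the same level. That is the kind of error the abstract attributes to a wide dynamic range.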

* updated for ICLR 2021 OpenReview rebuttal
