Title: Position-based Scaled Gradient for Model Quantization and Sparse Training

Jun 10, 2020

Jangho Kim, KiYoon Yoo, Nojun Kwak

Abstract: We propose the position-based scaled gradient (PSG), which scales the gradient depending on the position of a weight vector to make it more compression-friendly. First, we theoretically show that applying PSG to standard gradient descent (GD), which we call PSGD, is equivalent to GD in a warped weight space, a space made by warping the original weight space via an appropriately designed invertible function. Second, we empirically show that PSG, acting as a regularizer on a weight vector, is very useful in model compression domains such as quantization and sparse training. PSG reduces the gap between the weight distributions of a full-precision model and its compressed counterpart. This enables versatile deployment of a model in either uncompressed or compressed mode, depending on the availability of resources. Experimental results on the CIFAR-10/100 and ImageNet datasets show the effectiveness of the proposed PSG in both sparse training and quantization, even for extremely low bit widths.
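
To illustrate the idea of a gradient scaled by weight position, here is a minimal sketch in plain Python. The abstract does not give the scaling function, so everything specific here is an assumption made for illustration: the uniform grid with spacing `step`, the scale "distance to the nearest grid point plus a small `eps`", and the helper names `nearest_grid_point`, `psg_scale`, and `psgd_step`. It should not be read as the authors' formulation.

```python
def nearest_grid_point(w, step):
    """Round w to the nearest multiple of `step` (an assumed uniform grid)."""
    return round(w / step) * step


def psg_scale(w, step, eps=1e-3):
    """Hypothetical position-based scale for a single weight.

    Assumption: the scale grows with the distance between the weight and
    its nearest grid point, so weights already close to the grid receive
    very small updates. `eps` keeps the scale strictly positive.
    """
    return abs(w - nearest_grid_point(w, step)) + eps


def psgd_step(weights, grads, lr, step, eps=1e-3):
    """One gradient-descent step with each gradient scaled element-wise."""
    return [
        w - lr * psg_scale(w, step, eps) * g
        for w, g in zip(weights, grads)
    ]


if __name__ == "__main__":
    weights = [0.5, 0.3]   # 0.5 lies on the grid, 0.3 does not
    grads = [1.0, 1.0]     # identical raw gradients for both weights
    updated = psgd_step(weights, grads, lr=0.1, step=0.5)
    print(updated)
```

With identical raw gradients, the weight sitting on the grid barely moves while the off-grid weight takes a much larger step. Under this assumed scaling, that is the sense in which the update acts like a regularizer that favors compression-friendly positions.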