Denis A. Gudovskiy

DNN Feature Map Compression using Learned Representation over GF(2)

Aug 15, 2018

Denis A. Gudovskiy, Alec Hodgkinson, Luca Rigazio

Abstract: In this paper, we introduce a method to compress intermediate feature maps of deep neural networks (DNNs) to decrease memory storage and bandwidth requirements during inference. Unlike previous works, the proposed method is based on converting fixed-point activations into vectors over the smallest finite field, GF(2), followed by nonlinear dimensionality reduction (NDR) layers embedded into a DNN. Such an end-to-end learned representation finds more compact feature maps by exploiting quantization redundancies within the fixed-point activations along the channel or spatial dimensions. We apply the proposed network architectures, derived from modified SqueezeNet and MobileNetV2, to the tasks of ImageNet classification and PASCAL VOC object detection. Compared to prior approaches, the conducted experiments show a factor of 2 decrease in memory requirements with minor degradation in accuracy while adding only bitwise computations.

* CEFRL2018
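
The conversion step mentioned in the abstract above, from fixed-point activations to vectors over GF(2), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `to_gf2`, `from_gf2`, and `gf2_dot`, the 8-bit default width, and the LSB-first bit order are all assumptions made for illustration, and the learned NDR layers are not modeled at all. The sketch only shows that the expansion is lossless and that arithmetic over GF(2) reduces to bitwise operations (AND for multiplication, XOR for addition).

```python
from typing import List


def to_gf2(activations: List[int], bits: int = 8) -> List[int]:
    """Expand unsigned fixed-point values into one flat bit vector.

    Each value contributes `bits` entries, least significant bit first.
    (Bit order and width are illustrative assumptions.)
    """
    out: List[int] = []
    for a in activations:
        if not 0 <= a < (1 << bits):
            raise ValueError(f"{a} does not fit in {bits} unsigned bits")
        out.extend((a >> k) & 1 for k in range(bits))
    return out


def from_gf2(vector: List[int], bits: int = 8) -> List[int]:
    """Inverse of to_gf2: pack groups of `bits` entries back into integers."""
    if len(vector) % bits:
        raise ValueError("vector length must be a multiple of bits")
    return [
        sum(bit << k for k, bit in enumerate(vector[i:i + bits]))
        for i in range(0, len(vector), bits)
    ]


def gf2_dot(u: List[int], v: List[int]) -> int:
    """Inner product over GF(2): AND acts as multiply, XOR as add."""
    acc = 0
    for a, b in zip(u, v):
        acc ^= a & b
    return acc
```

For example, `from_gf2(to_gf2([0, 37, 255]))` returns the original list, which is the property that lets a network operate on the bit vectors and still recover fixed-point activations afterwards.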

ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks

Jun 07, 2017

Denis A. Gudovskiy, Luca Rigazio

Abstract: In this paper, we introduce ShiftCNN, a generalized low-precision architecture for inference of multiplierless convolutional neural networks (CNNs). ShiftCNN is based on a power-of-two weight representation and, as a result, performs only shift and addition operations. Furthermore, ShiftCNN substantially reduces the computational cost of convolutional layers by precomputing convolution terms. Such an optimization can be applied to any CNN architecture with a relatively small codebook of weights and can decrease the number of product operations by at least two orders of magnitude. The proposed architecture targets custom inference accelerators and can be realized on FPGAs or ASICs. Extensive evaluation on ImageNet shows that state-of-the-art CNNs can be converted without retraining into ShiftCNN with less than 1% drop in accuracy when the proposed quantization algorithm is employed. RTL simulations, targeting modern FPGAs, show that the power consumption of convolutional layers is reduced by a factor of 4 compared to conventional 8-bit fixed-point architectures.
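
To make the idea of a power-of-two weight representation concrete, here is a minimal sketch of how a multiplication by such a weight becomes a bit shift. It is not the quantization algorithm from the paper: rounding to the nearest power of two in the log domain, the exponent range, and the names `quantize_pow2`, `shift_mul`, and `shift_dot` are assumptions chosen for illustration, and the precomputation of convolution terms is not shown.

```python
import math
from typing import List, Tuple


def quantize_pow2(w: float, min_exp: int = -7, max_exp: int = 0) -> Tuple[int, int]:
    """Approximate w by sign * 2**exp (illustrative rounding rule).

    Returns (sign, exp); sign is 0 when w is exactly zero.
    """
    if w == 0.0:
        return 0, min_exp
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    return sign, max(min_exp, min(max_exp, exp))


def shift_mul(x: int, sign: int, exp: int, frac_bits: int = 7) -> int:
    """Compute x * sign * 2**exp using only a shift and a negation.

    The result is a fixed-point integer scaled by 2**frac_bits, so
    frac_bits + exp must be non-negative.
    """
    if sign == 0:
        return 0
    shifted = x << (frac_bits + exp)
    return shifted if sign > 0 else -shifted


def shift_dot(xs: List[int], ws: List[float], frac_bits: int = 7) -> int:
    """Dot product of integer inputs with power-of-two quantized weights."""
    acc = 0
    for x, w in zip(xs, ws):
        sign, exp = quantize_pow2(w)
        acc += shift_mul(x, sign, exp, frac_bits)
    return acc
```

With inputs `[3, 4]` and weights `[0.25, -0.5]`, `shift_dot` returns `-160`, which is `-1.25` after dividing by `2**7`; this matches `3 * 0.25 - 4 * 0.5` because both weights are already exact powers of two.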
