Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Shreyas K. Venkataramanaiah

Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Apr 19, 2018

Shihui Yin, Gaurav Srivastava, Shreyas K. Venkataramanaiah, Chaitali Chakrabarti, Visar Berisha, Jae-sun Seo

Figure 1 for Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Figure 2 for Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Figure 3 for Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Figure 4 for Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Abstract:Deep learning algorithms have shown tremendous success in many recognition tasks; however, these algorithms typically include a deep neural network (DNN) structure and a large number of parameters, which makes it challenging to implement them on power/area-constrained embedded platforms. To reduce the network size, several studies investigated compression by introducing element-wise or row-/column-/block-wise sparsity via pruning and regularization. In addition, many recent works have focused on reducing precision of activations and weights with some reducing down to a single bit. However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks have not been comprehensively explored. In this work, we present design techniques for minimum-area/-energy DNN hardware with minimal degradation in accuracy. During training, both binarization/low-precision and structured sparsity are applied as constraints to find the smallest memory footprint for a given deep learning algorithm. The DNN model for CIFAR-10 dataset with weight memory reduction of 50X exhibits accuracy comparable to that of the floating-point counterpart. Area, performance and energy results of DNN hardware in 40nm CMOS are reported for the MNIST dataset. The optimized DNN that combines 8X structured compression and 3-bit weight precision showed 98.4% accuracy at 20nJ per classification.

* 2017 Asilomar Conference on Signals, Systems and Computers

Via

Access Paper or Ask Questions

Algorithm and Hardware Design of Discrete-Time Spiking Neural Networks Based on Back Propagation with Binary Activations

Sep 19, 2017

Shihui Yin, Shreyas K. Venkataramanaiah, Gregory K. Chen, Ram Krishnamurthy, Yu Cao, Chaitali Chakrabarti, Jae-sun Seo

Figure 1 for Algorithm and Hardware Design of Discrete-Time Spiking Neural Networks Based on Back Propagation with Binary Activations

Figure 2 for Algorithm and Hardware Design of Discrete-Time Spiking Neural Networks Based on Back Propagation with Binary Activations

Figure 3 for Algorithm and Hardware Design of Discrete-Time Spiking Neural Networks Based on Back Propagation with Binary Activations

Figure 4 for Algorithm and Hardware Design of Discrete-Time Spiking Neural Networks Based on Back Propagation with Binary Activations

Abstract:We present a new back propagation based training algorithm for discrete-time spiking neural networks (SNN). Inspired by recent deep learning algorithms on binarized neural networks, binary activation with a straight-through gradient estimator is used to model the leaky integrate-fire spiking neuron, overcoming the difficulty in training SNNs using back propagation. Two SNN training algorithms are proposed: (1) SNN with discontinuous integration, which is suitable for rate-coded input spikes, and (2) SNN with continuous integration, which is more general and can handle input spikes with temporal information. Neuromorphic hardware designed in 40nm CMOS exploits the spike sparsity and demonstrates high classification accuracy (>98% on MNIST) and low energy (48.4-773 nJ/image).

* 2017 IEEE Biomedical Circuits and Systems (BioCAS)

Via

Access Paper or Ask Questions