Title: Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks

May 20, 2016

Philipp Gysel

Abstract: Convolutional neural networks (CNNs) have achieved major breakthroughs in recent years. Their performance in computer vision has matched and in some areas even surpassed human capabilities. Deep neural networks can capture complex non-linear features; however, this ability comes at the cost of high computational and memory requirements. State-of-the-art networks require billions of arithmetic operations and millions of parameters. To bring the power of deep learning to embedded devices such as smartphones, Google Glass and monitoring cameras, dedicated hardware accelerators can be used to decrease both execution time and power consumption. In applications where a fast connection to the cloud is not guaranteed or where privacy is important, computation needs to be done locally. Many hardware accelerators for deep neural networks have been proposed recently. An important first step of accelerator design is hardware-oriented approximation of deep networks, which enables energy-efficient inference. We present Ristretto, a fast and automated framework for CNN approximation. Ristretto simulates the hardware arithmetic of a custom hardware accelerator. The framework reduces the bit-width of network parameters and of the outputs of resource-intensive layers, which significantly reduces the chip area for multiplication units. Alternatively, Ristretto can remove the need for multipliers altogether, resulting in adder-only arithmetic. The tool fine-tunes trimmed networks to achieve high classification accuracy. Since training deep neural networks can be time-consuming, Ristretto uses highly optimized routines that run on the GPU. This enables fast compression of any given network. Given a maximum error tolerance of 1%, Ristretto can successfully condense CaffeNet and SqueezeNet to 8-bit. The code for Ristretto is available.

* Master's Thesis, University of California, Davis; 73 pages and 29 figures
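
The two approximation ideas named in the abstract, reduced bit-width and multiplier-free arithmetic, can be illustrated with a minimal sketch. This is not taken from the Ristretto code. The function names, the choice of a signed fixed-point format whose fractional length is derived from the largest magnitude, and the exponent range used for power-of-two weights are all assumptions made for illustration.

```python
import math

def choose_frac_bits(values, bit_width=8):
    """Pick the number of fractional bits so the largest magnitude fits.

    Hypothetical helper: one bit is reserved for the sign, the rest is
    split between integer and fractional bits.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bit_width - 1
    int_bits = math.floor(math.log2(max_abs)) + 1
    return bit_width - 1 - int_bits

def quantize_fixed_point(values, bit_width=8, frac_bits=None):
    """Round values to a signed fixed-point grid and saturate on overflow."""
    if frac_bits is None:
        frac_bits = choose_frac_bits(values, bit_width)
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    result = []
    for v in values:
        q = round(v * scale)            # nearest integer code
        q = max(q_min, min(q_max, q))   # saturate to the representable range
        result.append(q / scale)        # value the hardware would represent
    return result

def quantize_power_of_two(value, min_exp=-8, max_exp=0):
    """Snap a weight to +/- 2**e so a multiply can become a bit shift.

    The exponent range is an assumed example, not a value from the thesis.
    """
    if value == 0:
        return 0.0
    exp = round(math.log2(abs(value)))
    exp = max(min_exp, min(max_exp, exp))
    return math.copysign(2.0 ** exp, value)

weights = [0.5, -0.3, 0.124, 0.9]
print(quantize_fixed_point(weights, bit_width=8))
print([quantize_power_of_two(w) for w in weights])
```

In the first function pair, every weight is stored as an 8-bit integer code plus a shared scaling exponent, which is what allows narrower multipliers. In the second, each weight is a signed power of two, so a multiplication by it reduces to a shift, leaving only additions in the accumulation. The fine-tuning step mentioned in the abstract is not shown here.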
