Guihai Yan

Tetris: Re-architecting Convolutional Neural Network Computation for Machine Learning Accelerators

Nov 14, 2018

Hang Lu, Xin Wei, Ning Lin, Guihai Yan, and Xiaowei Li

Abstract: Inference efficiency is the predominant consideration in designing deep learning accelerators. Previous work mainly focuses on skipping zero values to reduce the large amount of ineffectual computation, while zero bits in non-zero values, another major source of ineffectual computation, are often ignored. The reason lies in the difficulty of extracting essential bits while performing multiply-and-accumulate (MAC) operations in the processing element. Based on the fact that zero bits account for as much as 68.9% of all weight bits in modern deep convolutional neural network models, this paper first proposes a weight kneading technique that eliminates ineffectual computation caused by both zero-valued weights and zero bits in non-zero weights. In addition, a split-and-accumulate (SAC) computing pattern that replaces conventional MAC, together with a corresponding hardware accelerator design called Tetris, is proposed to support weight kneading at the hardware level. Experimental results show that Tetris speeds up inference by up to 1.50x and improves power efficiency by up to 5.33x compared with state-of-the-art baselines.
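
The bit-level idea in the abstract can be illustrated with a small software sketch. This is not the paper's kneading algorithm or the Tetris hardware design; it only shows, under assumed conditions (integer weights in sign-magnitude form, 8 magnitude bits), how a dot product can be computed by touching only the one-bits of each weight. The function names `zero_bit_fraction` and `sac_dot` are hypothetical.

```python
def zero_bit_fraction(weights, bits=8):
    """Fraction of zero bits across the magnitudes of integer weights.

    Assumes each magnitude fits in `bits` bits (an assumption of this sketch).
    """
    total = len(weights) * bits
    mask = (1 << bits) - 1
    ones = sum(bin(abs(w) & mask).count("1") for w in weights)
    return (total - ones) / total


def sac_dot(weights, activations, bits=8):
    """Dot product in a split-then-accumulate style.

    For every bit position, activations whose weight has that bit set are
    summed into a per-position segment. The segments are shifted and added
    once at the end, so zero bits and zero-valued weights cost no work.
    """
    segments = [0] * bits
    for w, a in zip(weights, activations):
        sign = -1 if w < 0 else 1
        magnitude = abs(w)
        for b in range(bits):
            if (magnitude >> b) & 1:
                segments[b] += sign * a
    # One shift-and-add per bit position replaces one multiply per weight.
    return sum(seg << b for b, seg in enumerate(segments))


weights = [3, 0, -5, 8]
activations = [2, 7, 1, 4]
result = sac_dot(weights, activations)
reference = sum(w * a for w, a in zip(weights, activations))
```

Here `result` and `reference` agree, and `zero_bit_fraction(weights)` reports how many of the weight bits would have been skipped in this toy example.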

* ICCAD 2018 paper

AxTrain: Hardware-Oriented Neural Network Training for Approximate Inference

May 21, 2018

Xin He, Liu Ke, Wenyan Lu, Guihai Yan, Xuan Zhang

Abstract: The intrinsic error tolerance of neural networks (NNs) makes approximate computing a promising technique to improve the energy efficiency of NN inference. Conventional approximate computing focuses on balancing the efficiency-accuracy trade-off for existing pre-trained networks, which can lead to suboptimal solutions. In this paper, we propose AxTrain, a hardware-oriented training framework to facilitate approximate computing for NN inference. Specifically, AxTrain leverages the synergy between two orthogonal methods: one actively searches for a network parameter distribution with high error tolerance, and the other passively learns resilient weights by numerically incorporating the noise distributions of the approximate hardware into the forward pass during training. Experimental results on various datasets with near-threshold computing and approximate multiplication strategies demonstrate AxTrain's ability to obtain resilient neural network parameters and improve system energy efficiency.
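
The "passive" method described in the abstract, injecting hardware noise into the forward pass, can be sketched in a few lines. This is an illustration only, not AxTrain's implementation: the noise model (an independent Gaussian relative error on each product) and the names `noisy_dot` and `rel_sigma` are assumptions of this sketch, and the abstract does not specify the distributions actually used.

```python
import random


def noisy_dot(weights, inputs, rel_sigma=0.05, rng=None):
    """Dot product where each multiplication carries a relative error.

    Assumed noise model: every product is scaled by (1 + e), with e drawn
    from a zero-mean Gaussian of standard deviation `rel_sigma`. This stands
    in for an approximate multiplier; a real one would need a measured
    error distribution.
    """
    rng = rng or random.Random()
    total = 0.0
    for w, x in zip(weights, inputs):
        exact = w * x
        total += exact * (1.0 + rng.gauss(0.0, rel_sigma))
    return total


# Using the noisy forward pass during training exposes the weights to the
# same kind of error they would see on approximate hardware.
exact_output = noisy_dot([1.0, 2.0], [3.0, 4.0], rel_sigma=0.0)
noisy_output = noisy_dot([1.0, 2.0], [3.0, 4.0], rng=random.Random(0))
```

With `rel_sigma=0.0` the function reduces to an exact dot product, which makes it easy to compare noisy and noise-free outputs for the same weights.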

* In International Symposium on Low Power Electronics and Design (ISLPED) 2018
