Abstract:Neural approximate computing gains enormous energy-efficiency at the cost of tolerable quality-loss. A neural approximator can map the input data to output while a classifier determines whether the input data are safe to approximate with quality guarantee. However, existing works cannot maximize the invocation of the approximator, resulting in limited speedup and energy saving. By exploring the mapping space of those target functions, in this paper, we observe a nonuniform distribution of the approximation error incurred by the same approximator. We thus propose a novel approximate computing architecture with a Multiclass-Classifier and Multiple Approximators (MCMA). These approximators have identical network topologies and thus can share the same hardware resource in a neural processing unit(NPU) clip. In the runtime, MCMA can swap in the invoked approximator by merely shipping the synapse weights from the on-chip memory to the buffers near MAC within a cycle. We also propose efficient co-training methods for such MCMA architecture. Experimental results show a more substantial invocation of MCMA as well as the gain of energy-efficiency.
Abstract:Neural network based approximate computing is a universal architecture promising to gain tremendous energy-efficiency for many error resilient applications. To guarantee the approximation quality, existing works deploy two neural networks (NNs), e.g., an approximator and a predictor. The approximator provides the approximate results, while the predictor predicts whether the input data is safe to approximate with the given quality requirement. However, it is non-trivial and time-consuming to make these two neural network coordinate---they have different optimization objectives---by training them separately. This paper proposes a novel neural network structure---AXNet---to fuse two NNs to a holistic end-to-end trainable NN. Leveraging the philosophy of multi-task learning, AXNet can tremendously improve the invocation (proportion of safe-to-approximate samples) and reduce the approximation error. The training effort also decrease significantly. Experiment results show 50.7% more invocation and substantial cuts of training time when compared to existing neural network based approximate computing framework.