Title: Streaming MANN: A Streaming-Based Inference for Energy-Efficient Memory-Augmented Neural Networks

May 21, 2018

Seongsik Park, Jaehee Jang, Seijoon Kim, Sungroh Yoon

Abstract: With the successful development of artificial intelligence using deep learning, there has been growing interest in its deployment. The mobile environment is the hardware platform closest to real life, and it has become an important platform for the success or failure of artificial intelligence. Memory-augmented neural networks (MANNs) are neural networks proposed to efficiently handle question-and-answer (Q&A) tasks, well-suited for mobile devices. Because a MANN requires various types of operations and recurrent data paths, it is difficult to accelerate its inference on architectures designed for other, conventional neural network models, which is one of the biggest obstacles to deploying MANNs in mobile environments. To address these issues, we propose Streaming MANN. This is the first attempt to implement and demonstrate an architecture for energy-efficient inference of MANNs based on the concept of streaming processing. To achieve the full potential of streaming processing, we propose a novel approach, called inference thresholding, that uses a Bayesian approach and considers the characteristics of natural language processing (NLP) tasks. To evaluate the proposed approaches, we implemented the architecture and method in a field-programmable gate array (FPGA), which is suitable for streaming processing. We measured the execution time and power consumption of inference on the bAbI dataset. The experimental results showed that the performance efficiency per energy (FLOPS/kJ) of the Streaming MANN increased by a factor of up to about 126 compared to the results of an NVIDIA TITAN V, and by a factor of up to 140 when inference thresholding is applied.
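The FLOPS/kJ metric in the abstract combines the two measured quantities, execution time and power consumption. The following is a minimal sketch of how such a figure could be computed, assuming the metric means throughput (FLOP/s) divided by the energy consumed (kJ). The function names and all numbers are hypothetical placeholders for illustration, not values or code from the paper.

```python
def flops_per_kilojoule(flop_count: float, time_s: float, power_w: float) -> float:
    """Throughput (FLOP/s) divided by energy consumed (kJ).

    Assumed reading of the FLOPS/kJ metric; the paper may define it differently.
    """
    if time_s <= 0 or power_w <= 0:
        raise ValueError("time_s and power_w must be positive")
    throughput = flop_count / time_s          # FLOP per second
    energy_kj = power_w * time_s / 1000.0     # W * s = J, then J -> kJ
    return throughput / energy_kj


def efficiency_gain(candidate: float, baseline: float) -> float:
    """Ratio of two efficiency figures, e.g. accelerator vs. GPU."""
    return candidate / baseline


if __name__ == "__main__":
    # Hypothetical measurements, chosen only to exercise the arithmetic.
    gpu = flops_per_kilojoule(flop_count=1e9, time_s=1.0, power_w=100.0)
    fpga = flops_per_kilojoule(flop_count=1e9, time_s=1.0, power_w=10.0)
    print(f"gain: {efficiency_gain(fpga, gpu):.1f}x")
```

Under this definition, lowering power at equal execution time or lowering execution time at equal power both raise the metric, which is why a low-power streaming design can show large gains over a GPU.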
