Christiam F. Frasser

Fully-parallel Convolutional Neural Network Hardware

Jun 22, 2020

Christiam F. Frasser, Pablo Linares-Serrano, V. Canals, Miquel Roca, T. Serrano-Gotarredona, Josep L. Rossello

Abstract: A new trans-disciplinary knowledge area, Edge Artificial Intelligence or Edge Intelligence, is beginning to receive a tremendous amount of interest from the machine learning community due to the ever-increasing popularization of the Internet of Things (IoT). Unfortunately, incorporating AI capabilities into edge computing devices is power- and area-hungry for typical machine learning techniques such as Convolutional Neural Networks (CNNs). In this work, we propose a new power- and area-efficient architecture for implementing Artificial Neural Networks (ANNs) in hardware, based on the exploitation of the correlation phenomenon in Stochastic Computing (SC) systems. The proposed architecture can solve the difficult implementation challenges that SC presents for CNN applications, such as the high resource usage of binary-to-stochastic conversion, the inaccuracy produced by undesired correlation between signals, and the implementation of the stochastic maximum function. Compared with traditional binary logic implementations, experimental results showed improvements of 19.6x and 6.3x in speed and energy efficiency, respectively, for the FPGA implementation. We have also realized a full VLSI implementation of the proposed SC-CNN architecture, demonstrating that our optimizations achieve an 18x area reduction over a previous SC-DNN architecture VLSI implementation in a comparable technological node. For the first time, a fully-parallel CNN such as LeNet-5 is embedded and tested in a single FPGA, showing the benefits of using stochastic computing for embedded applications, in contrast to traditional binary logic implementations.

* 8 pages, 6 figures, to be submitted to an IEEE journal
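
The role that correlation plays in stochastic computing can be illustrated with a small software simulation. The sketch below is not the authors' architecture; it only shows the generic, well-known behaviour of unipolar stochastic bitstreams, assuming a comparator-style binary-to-stochastic converter. The names `to_stream` and `estimate`, the stream length, and the example values 0.3 and 0.8 are illustrative assumptions. When two streams are generated from the same random sequence (fully correlated), an AND gate yields the minimum and an OR gate yields the maximum; when they use independent sequences, the AND gate approximates the product.

```python
import random

def to_stream(p, rand_values):
    """Comparator-style conversion (assumed): emit 1 whenever r < p."""
    return [1 if r < p else 0 for r in rand_values]

def estimate(stream):
    """Value encoded by a unipolar stream: the fraction of ones."""
    return sum(stream) / len(stream)

rng = random.Random(0)       # fixed seed so the run is repeatable
n = 10_000                   # stream length (illustrative choice)
a, b = 0.3, 0.8              # example values in [0, 1]

shared = [rng.random() for _ in range(n)]
other = [rng.random() for _ in range(n)]

# Correlated streams: both converters reuse the same random sequence.
a_corr = to_stream(a, shared)
b_corr = to_stream(b, shared)
# Uncorrelated stream: b is converted with an independent sequence.
b_ind = to_stream(b, other)

and_corr = [x & y for x, y in zip(a_corr, b_corr)]  # behaves as min(a, b)
or_corr = [x | y for x, y in zip(a_corr, b_corr)]   # behaves as max(a, b)
and_ind = [x & y for x, y in zip(a_corr, b_ind)]    # approximates a * b

print("min     ~", estimate(and_corr))
print("max     ~", estimate(or_corr))
print("product ~", estimate(and_ind))
```

Sharing one random sequence between converters also hints at why correlation can reduce conversion cost, since a single random number generator serves several signals; how the paper actually exploits this is described in the paper itself, not in this sketch.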

Reservoir Computing Hardware with Cellular Automata

Jun 21, 2018

Alejandro Morán, Christiam F. Frasser, Josep L. Rosselló

Abstract: Elementary cellular automata (ECA) are a widely studied one-dimensional processing methodology in which successive iterations of the automaton may produce rich pattern dynamics. Recently, cellular automata have been proposed as a feasible way to implement Reservoir Computing (RC) systems in which the automaton rule is fixed and the training is performed using a linear regression. In this work we perform an exhaustive study of the performance of the different ECA rules when applied to pattern recognition of time-independent input signals using an RC scheme. Once the different ECA rules have been tested, the most accurate one (rule 90) is selected to implement a digital circuit. Rule 90 is easily reproduced using a reduced set of XOR gates and shift registers, thus representing a high-performance alternative for RC hardware implementation in terms of processing time, circuit area, power dissipation and system accuracy. The model (both in software and in its hardware implementation) has been tested using a pattern recognition task of handwritten numbers (the MNIST database), for which we obtained competitive results in terms of accuracy, speed and power dissipation. The proposed model can be considered a low-cost method to implement fast pattern recognition digital circuits.

* 20 pages, 11 figures, draft of an article currently submitted to an IEEE journal
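
Rule 90 updates each cell to the XOR of its two neighbours, which is why it maps onto XOR gates so directly. The following is a minimal software sketch of that update, not the authors' circuit: the function names `rule90_step` and `rule90_run`, the 7-cell seed, and the periodic (wrap-around) boundary are assumptions made for illustration. The reservoir readout (the linear regression trained on the automaton states) is omitted.

```python
def rule90_step(cells):
    """One rule 90 update: each cell becomes left XOR right.

    A periodic (wrap-around) boundary is assumed here.
    """
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]

def rule90_run(cells, steps):
    """Return the initial state followed by `steps` successive states."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(rule90_step(history[-1]))
    return history

# Illustrative seed: a single active cell in the middle of 7 cells.
seed = [0, 0, 0, 1, 0, 0, 0]
history = rule90_run(seed, 3)

for row in history:
    print("".join("#" if c else "." for c in row))
```

In a reservoir computing setting, the rows of `history` would typically be concatenated into a feature vector for the linear readout; the exact encoding used in the paper is not specified in this abstract.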
