Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

S. Burc Eryilmaz

Stanford University

Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Aug 17, 2021

Weier Wan, Rajkumar Kubendran, Clemens Schaefer, S. Burc Eryilmaz, Wenqiang Zhang, Dabin Wu, Stephen Deiss, Priyanka Raina, He Qian, Bin Gao(+4 more)

Figure 1 for Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Figure 2 for Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Figure 3 for Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Abstract:Realizing today's cloud-level artificial intelligence functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g. video, audio) at unprecedented energy-efficiency. AI hardware architectures today cannot meet the demand due to a fundamental "memory wall": data movement between separate compute and memory units consumes large energy and incurs long latency. Resistive random-access memory (RRAM) based compute-in-memory (CIM) architectures promise to bring orders of magnitude energy-efficiency improvement by performing computation directly within memory. However, conventional approaches to CIM hardware design limit its functional flexibility necessary for processing diverse AI workloads, and must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements on any single level of the design. By co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM - the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility for diverse model architectures, record energy-efficiency $5\times$ - $8\times$ better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks including accuracy of 99.0% on MNIST and 85.7% on CIFAR-10 image classification, 84.7% accuracy on Google speech command recognition, and a 70% reduction in image reconstruction error on a Bayesian image recovery task. This work paves a way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future.

* 34 pages, 14 figures, 1 table

Via

Access Paper or Ask Questions

Training a Probabilistic Graphical Model with Resistive Switching Electronic Synapses

Oct 10, 2016

S. Burc Eryilmaz, Emre Neftci, Siddharth Joshi, SangBum Kim, Matthew BrightSky, Hsiang-Lan Lung, Chung Lam, Gert Cauwenberghs, H. -S. Philip Wong

Figure 1 for Training a Probabilistic Graphical Model with Resistive Switching Electronic Synapses

Figure 2 for Training a Probabilistic Graphical Model with Resistive Switching Electronic Synapses

Figure 3 for Training a Probabilistic Graphical Model with Resistive Switching Electronic Synapses

Figure 4 for Training a Probabilistic Graphical Model with Resistive Switching Electronic Synapses

Abstract:Current large scale implementations of deep learning and data mining require thousands of processors, massive amounts of off-chip memory, and consume gigajoules of energy. Emerging memory technologies such as nanoscale two-terminal resistive switching memory devices offer a compact, scalable and low power alternative that permits on-chip co-located processing and memory in fine-grain distributed parallel architecture. Here we report first use of resistive switching memory devices for implementing and training a Restricted Boltzmann Machine (RBM), a generative probabilistic graphical model as a key component for unsupervised learning in deep networks. We experimentally demonstrate a 45-synapse RBM realized with 90 resistive switching phase change memory (PCM) elements trained with a bio-inspired variant of the Contrastive Divergence (CD) algorithm, implementing Hebbian and anti-Hebbian weight updates. The resistive PCM devices show a two-fold to ten-fold reduction in error rate in a missing pixel pattern completion task trained over 30 epochs, compared to untrained case. Measured programming energy consumption is 6.1 nJ per epoch with the resistive switching PCM devices, a factor of ~150 times lower than conventional processor-memory systems. We analyze and discuss the dependence of learning performance on cycle-to-cycle variations as well as number of gradual levels in the PCM analog memory devices.

* Accepted for publication in IEEE Transactions on Electron Devices. This version is the submitted version

Via

Access Paper or Ask Questions

Experimental Demonstration of Array-level Learning with Phase Change Synaptic Devices

Jun 03, 2014

S. Burc Eryilmaz, Duygu Kuzum, Rakesh G. D. Jeyasingh, SangBum Kim, Matthew BrightSky, Chung Lam, H. -S. Philip Wong

Figure 1 for Experimental Demonstration of Array-level Learning with Phase Change Synaptic Devices

Figure 2 for Experimental Demonstration of Array-level Learning with Phase Change Synaptic Devices

Figure 3 for Experimental Demonstration of Array-level Learning with Phase Change Synaptic Devices

Figure 4 for Experimental Demonstration of Array-level Learning with Phase Change Synaptic Devices

Abstract:The computational performance of the biological brain has long attracted significant interest and has led to inspirations in operating principles, algorithms, and architectures for computing and signal processing. In this work, we focus on hardware implementation of brain-like learning in a brain-inspired architecture. We demonstrate, in hardware, that 2-D crossbar arrays of phase change synaptic devices can achieve associative learning and perform pattern recognition. Device and array-level studies using an experimental 10x10 array of phase change synaptic devices have shown that pattern recognition is robust against synaptic resistance variations and large variations can be tolerated by increasing the number of training iterations. Our measurements show that increase in initial variation from 9 % to 60 % causes required training iterations to increase from 1 to 11.

* Electron Devices Meeting (IEDM), 2013 IEEE International, pp.25.5.1,25.5.4, 9-11 Dec. 2013
* IEDM 2013

Via

Access Paper or Ask Questions