Abstract:The roadmap is organized into several thematic sections, outlining current computing challenges, discussing the neuromorphic computing approach, analyzing mature and currently utilized technologies, providing an overview of emerging technologies, addressing material challenges, exploring novel computing concepts, and finally examining the maturity level of emerging technologies while determining the next essential steps for their advancement.
Abstract:Catastrophic forgetting remains a challenge for neural networks, especially in lifelong learning scenarios. In this study, we introduce MEtaplasticity from Synaptic Uncertainty (MESU), inspired by metaplasticity and Bayesian inference principles. MESU harnesses synaptic uncertainty to retain information over time, with its update rule closely approximating the diagonal Newton's method for synaptic updates. Through continual learning experiments on permuted MNIST tasks, we demonstrate MESU's remarkable capability to maintain learning performance across 100 tasks without the need of explicit task boundaries.
Abstract:An increasing number of neuroscience studies are highlighting the importance of spatial dendritic branching in pyramidal neurons in the brain for supporting non-linear computation through localized synaptic integration. In particular, dendritic branches play a key role in temporal signal processing and feature detection, using coincidence detection (CD) mechanisms, made possible by the presence of synaptic delays that align temporally disparate inputs for effective integration. Computational studies on spiking neural networks further highlight the significance of delays for CD operations, enabling spatio-temporal pattern recognition within feed-forward neural networks without the need for recurrent architectures. In this work, we present DenRAM, the first realization of a spiking neural network with analog dendritic circuits, integrated into a 130nm technology node coupled with resistive memory (RRAM) technology. DenRAM's dendritic circuits use the RRAM devices to implement both delays and synaptic weights in the network. By configuring the RRAM devices to reproduce bio-realistic timescales, and through exploiting their heterogeneity, we experimentally demonstrate DenRAM's capability to replicate synaptic delay profiles, and efficiently implement CD for spatio-temporal pattern recognition. To validate the architecture, we conduct comprehensive system-level simulations on two representative temporal benchmarks, highlighting DenRAM's resilience to analog hardware noise, and its superior accuracy compared to recurrent architectures with an equivalent number of parameters. DenRAM not only brings rich temporal processing capabilities to neuromorphic architectures, but also reduces the memory footprint of edge devices, provides high accuracy on temporal benchmarks, and represents a significant step-forward in low-power real-time signal processing technologies.
Abstract:Deep learning has made remarkable progress in various tasks, surpassing human performance in some cases. However, one drawback of neural networks is catastrophic forgetting, where a network trained on one task forgets the solution when learning a new one. To address this issue, recent works have proposed solutions based on Binarized Neural Networks (BNNs) incorporating metaplasticity. In this work, we extend this solution to quantized neural networks (QNNs) and present a memristor-based hardware solution for implementing metaplasticity during both inference and training. We propose a hardware architecture that integrates quantized weights in memristor devices programmed in an analog multi-level fashion with a digital processing unit for high-precision metaplastic storage. We validated our approach using a combined software framework and memristor based crossbar array for in-memory computing fabricated in 130 nm CMOS technology. Our experimental results show that a two-layer perceptron achieves 97% and 86% accuracy on consecutive training of MNIST and Fashion-MNIST, equal to software baseline. This result demonstrates immunity to catastrophic forgetting and the resilience to analog device imperfections of the proposed solution. Moreover, our architecture is compatible with the memristor limited endurance and has a 15x reduction in memory
Abstract:Biological neurons can detect complex spatio-temporal features in spiking patterns via their synapses spread across across their dendritic branches. This is achieved by modulating the efficacy of the individual synapses, and by exploiting the temporal delays of their response to input spikes, depending on their position on the dendrite. Inspired by this mechanism, we propose a neuromorphic hardware architecture equipped with multiscale dendrites, each of which has synapses with tunable weight and delay elements. Weights and delays are both implemented using Resistive Random Access Memory (RRAM). We exploit the variability in the high resistance state of RRAM to implement a distribution of delays in the millisecond range for enabling spatio-temporal detection of sensory signals. We demonstrate the validity of the approach followed with a RRAM-aware simulation of a heartbeat anomaly detection task. In particular we show that, by incorporating delays directly into the network, the network's power and memory footprint can be reduced by up to 100x compared to equivalent state-of-the-art spiking recurrent networks with no delays.
Abstract:The implementation of current deep learning training algorithms is power-hungry, owing to data transfer between memory and logic units. Oxide-based RRAMs are outstanding candidates to implement in-memory computing, which is less power-intensive. Their weak RESET regime, is particularly attractive for learning, as it allows tuning the resistance of the devices with remarkable endurance. However, the resistive change behavior in this regime suffers many fluctuations and is particularly challenging to model, especially in a way compatible with tools used for simulating deep learning. In this work, we present a model of the weak RESET process in hafnium oxide RRAM and integrate this model within the PyTorch deep learning framework. Validated on experiments on a hybrid CMOS/RRAM technology, our model reproduces both the noisy progressive behavior and the device-to-device (D2D) variability. We use this tool to train Binarized Neural Networks for the MNIST handwritten digit recognition task and the CIFAR-10 object classification task. We simulate our model with and without various aspects of device imperfections to understand their impact on the training process and identify that the D2D variability is the most detrimental aspect. The framework can be used in the same manner for other types of memories to identify the device imperfections that cause the most degradation, which can, in turn, be used to optimize the devices to reduce the impact of these imperfections.
Abstract:Reading several ReRAMs simultaneously in a neuromorphic circuit increases power consumption and limits scalability. Applying small inference read pulses is a vain attempt when offset voltages of the read-out circuit are decisively more. This paper presents an experimental validation of a three-stage calibration scheme to calibrate the DC offset voltage across the rows of the memristive crossbar. The proposed method is based on biasing the body terminal of one of the differential pair MOSFETs of the buffer through a series of cascaded resistor banks arranged in three stages: coarse, fine and finer stages. The circuit is designed in a 130 nm CMOS technology, where the OxRAM-based binary memristors are built on top of it. A dedicated PCB and other auxiliary boards have been designed for testing the chip. Experimental results validate the presented approach, which is only limited by mismatch and electrical noise.
Abstract:Neuromorphic computing is henceforth a major research field for both academic and industrial actors. As opposed to Von Neumann machines, brain-inspired processors aim at bringing closer the memory and the computational elements to efficiently evaluate machine-learning algorithms. Recently, Spiking Neural Networks, a generation of cognitive algorithms employing computational primitives mimicking neuron and synapse operational principles, have become an important part of deep learning. They are expected to improve the computational performance and efficiency of neural networks, but are best suited for hardware able to support their temporal dynamics. In this survey, we present the state of the art of hardware implementations of spiking neural networks and the current trends in algorithm elaboration from model selection to training mechanisms. The scope of existing solutions is extensive; we thus present the general framework and study on a case-by-case basis the relevant particularities. We describe the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level and discuss their related advantages and challenges.