Abstract:Edge devices are being deployed at increasing volumes to sense and act on information from the physical world. The discrete Fourier transform (DFT) is often necessary to make this sensed data suitable for further processing $\unicode{x2013}$ such as by artificial intelligence (AI) algorithms $\unicode{x2013}$ and for transmission over communication networks. Analog in-memory computing has been shown to be a fast and energy-efficient solution for processing edge AI workloads, but not for Fourier transforms. This is because of the existence of the fast Fourier transform (FFT) algorithm, which enormously reduces the complexity of the DFT but has so far belonged only to digital processors. Here, we show that the FFT can be mapped to analog in-memory computing systems, enabling them to efficiently scale to arbitrarily large Fourier transforms without requiring large sizes or large numbers of non-volatile memory arrays. We experimentally demonstrate analog FFTs on 1D audio and 2D image signals, using a large-scale charge-trapping memory array with precisely tunable, low-conductance analog states. The scalability of both the new analog FFT approach and the charge-trapping memory device is leveraged to compute a 65,536-point analog DFT, a scale that is otherwise inaccessible by analog systems and which is $>$1000$\times$ larger than any previous analog DFT demonstration. The analog FFT also provides more numerically precise DFTs with greater tolerance to device and circuit non-idealities than a direct matrix-vector multiplication approach. We show that the extension of the FFT algorithm to analog in-memory processors leads to design considerations that differ markedly from digital implementations, and that analog Fourier transforms have a substantial power efficiency advantage at all size scales over FFTs implemented on state-of-the-art digital hardware.
Abstract:In neuromorphic computing, artificial synapses provide a multi-weight conductance state that is set based on inputs from neurons, analogous to the brain. Additional properties of the synapse beyond multiple weights can be needed, and can depend on the application, requiring the need for generating different synapse behaviors from the same materials. Here, we measure artificial synapses based on magnetic materials that use a magnetic tunnel junction and a magnetic domain wall. By fabricating lithographic notches in a domain wall track underneath a single magnetic tunnel junction, we achieve 4-5 stable resistance states that can be repeatably controlled electrically using spin orbit torque. We analyze the effect of geometry on the synapse behavior, showing that a trapezoidal device has asymmetric weight updates with high controllability, while a straight device has higher stochasticity, but with stable resistance levels. The device data is input into neuromorphic computing simulators to show the usefulness of application-specific synaptic functions. Implementing an artificial neural network applied on streamed Fashion-MNIST data, we show that the trapezoidal magnetic synapse can be used as a metaplastic function for efficient online learning. Implementing a convolutional neural network for CIFAR-100 image recognition, we show that the straight magnetic synapse achieves near-ideal inference accuracy, due to the stability of its resistance levels. This work shows multi-weight magnetic synapses are a feasible technology for neuromorphic computing and provides design guidelines for emerging artificial synapse technologies.
Abstract:Specialized accelerators have recently garnered attention as a method to reduce the power consumption of neural network inference. A promising category of accelerators utilizes nonvolatile memory arrays to both store weights and perform $\textit{in situ}$ analog computation inside the array. While prior work has explored the design space of analog accelerators to optimize performance and energy efficiency, there is seldom a rigorous evaluation of the accuracy of these accelerators. This work shows how architectural design decisions, particularly in mapping neural network parameters to analog memory cells, influence inference accuracy. When evaluated using ResNet50 on ImageNet, the resilience of the system to analog non-idealities - cell programming errors, analog-to-digital converter resolution, and array parasitic resistances - all improve when analog quantities in the hardware are made proportional to the weights in the network. Moreover, contrary to the assumptions of prior work, nearly equivalent resilience to cell imprecision can be achieved by fully storing weights as analog quantities, rather than spreading weight bits across multiple devices, often referred to as bit slicing. By exploiting proportionality, analog system designers have the freedom to match the precision of the hardware to the needs of the algorithm, rather than attempting to guarantee the same level of precision in the intermediate results as an equivalent digital accelerator. This ultimately results in an analog accelerator that is more accurate, more robust to analog errors, and more energy-efficient.
Abstract:Neuromorphic computing systems overcome the limitations of traditional von Neumann computing architectures. These computing systems can be further improved upon by using emerging technologies that are more efficient than CMOS for neural computation. Recent research has demonstrated memristors and spintronic devices in various neural network designs boost efficiency and speed. This paper presents a biologically inspired fully spintronic neuron used in a fully spintronic Hopfield RNN. The network is used to solve tasks, and the results are compared against those of current Hopfield neuromorphic architectures which use emerging technologies.
Abstract:Neuromorphic computing with spintronic devices has been of interest due to the limitations of CMOS-driven von Neumann computing. Domain wall-magnetic tunnel junction (DW-MTJ) devices have been shown to be able to intrinsically capture biological neuron behavior. Edgy-relaxed behavior, where a frequently firing neuron experiences a lower action potential threshold, may provide additional artificial neuronal functionality when executing repeated tasks. In this study, we demonstrate that this behavior can be implemented in DW-MTJ artificial neurons via three alternative mechanisms: shape anisotropy, magnetic field, and current-driven soft reset. Using micromagnetics and analytical device modeling to classify the Optdigits handwritten digit dataset, we show that edgy-relaxed behavior improves both classification accuracy and classification rate for ordered datasets while sacrificing little to no accuracy for a randomized dataset. This work establishes methods by which artificial spintronic neurons can be flexibly adapted to datasets.
Abstract:Complementary metal oxide semiconductor (CMOS) devices display volatile characteristics, and are not well suited for analog applications such as neuromorphic computing. Spintronic devices, on the other hand, exhibit both non-volatile and analog features, which are well-suited to neuromorphic computing. Consequently, these novel devices are at the forefront of beyond-CMOS artificial intelligence applications. However, a large quantity of these artificial neuromorphic devices still require the use of CMOS, which decreases the efficiency of the system. To resolve this, we have previously proposed a number of artificial neurons and synapses that do not require CMOS for operation. Although these devices are a significant improvement over previous renditions, their ability to enable neural network learning and recognition is limited by their intrinsic activation functions. This work proposes modifications to these spintronic neurons that enable configuration of the activation functions through control of the shape of a magnetic domain wall track. Linear and sigmoidal activation functions are demonstrated in this work, which can be extended through a similar approach to enable a wide variety of activation functions.
Abstract:Non-volatile memory arrays can deploy pre-trained neural network models for edge inference. However, these systems are affected by device-level noise and retention issues. Here, we examine damage caused by these effects, introduce a mitigation strategy, and demonstrate its use in fabricated array of SONOS (Silicon-Oxide-Nitride-Oxide-Silicon) devices. On MNIST, fashion-MNIST, and CIFAR-10 tasks, our approach increases resilience to synaptic noise and drift. We also show strong performance can be realized with ADCs of 5-8 bits precision.
Abstract:We propose a hardware learning rule for unsupervised clustering within a novel spintronic computing architecture. The proposed approach leverages the three-terminal structure of domain-wall magnetic tunnel junction devices to establish a feedback loop that serves to train such devices when they are used as synapses in a neuromorphic computing architecture.
Abstract:Machine learning implements backpropagation via abundant training samples. We demonstrate a multi-stage learning system realized by a promising non-volatile memory device, the domain-wall magnetic tunnel junction (DW-MTJ). The system consists of unsupervised (clustering) as well as supervised sub-systems, and generalizes quickly (with few samples). We demonstrate interactions between physical properties of this device and optimal implementation of neuroscience-inspired plasticity learning rules, and highlight performance on a suite of tasks. Our energy analysis confirms the value of the approach, as the learning budget stays below 20 $\mu J$ even for large tasks used typically in machine learning.
Abstract:Neuromorphic-style inference only works well if limited hardware resources are maximized properly, e.g. accuracy continues to scale with parameters and complexity in the face of potential disturbance. In this work, we use realistic crossbar simulations to highlight that compact implementations of deep neural networks are unexpectedly susceptible to collapse from multiple system disturbances. Our work proposes a middle path towards high performance and strong resilience utilizing the Mosaics framework, and specifically by re-using synaptic connections in a recurrent neural network implementation that possesses a natural form of noise-immunity.