Sumit Bam Shrestha

Region Masking to Accelerate Video Processing on Neuromorphic Hardware

Mar 21, 2025

Sreetama Sarkar, Sumit Bam Shrestha, Yue Che, Leobardo Campos-Macias, Gourav Datta, Peter A. Beerel

Abstract:The rapidly growing demand for on-chip edge intelligence on resource-constrained devices has motivated approaches to reduce energy and latency of deep learning models. Spiking neural networks (SNNs) have gained particular interest due to their promise to reduce energy consumption using event-based processing. We assert that while sigma-delta encoding in SNNs can take advantage of the temporal redundancy across video frames, they still involve a significant amount of redundant computations due to processing insignificant events. In this paper, we propose a region masking strategy that identifies regions of interest at the input of the SNN, thereby eliminating computation and data movement for events arising from unimportant regions. Our approach demonstrates that masking regions at the input not only significantly reduces the overall spiking activity of the network, but also provides significant improvement in throughput and latency. We apply region masking during video object detection on Loihi 2, demonstrating that masking approximately 60% of input regions can reduce energy-delay product by 1.65x over a baseline sigma-delta network, with a degradation in mAP@0.5 by 1.09%.
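The interaction between sigma-delta encoding and input region masking described in this abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation and uses no Loihi 2 API. The names `delta_encode` and `block_mask`, the threshold, the block size, and the choice of which block to keep are all hypothetical. The abstract does not say how regions of interest are identified, so the mask is simply assumed to be given.

```python
import numpy as np

def delta_encode(frames, threshold):
    """Emit a graded event wherever a pixel drifts from the last
    transmitted value by more than `threshold` (assumed encoder)."""
    ref = np.zeros_like(frames[0], dtype=float)  # receiver-side reconstruction
    events = []
    for frame in frames:
        diff = frame - ref
        ev = np.where(np.abs(diff) > threshold, diff, 0.0)
        ref = ref + ev  # sigma step: accumulate the transmitted deltas
        events.append(ev)
    return np.stack(events)

def block_mask(shape, block, keep_blocks):
    """Boolean mask that is True only inside the listed (row, col) blocks."""
    mask = np.zeros(shape, dtype=bool)
    for r, c in keep_blocks:
        mask[r * block:(r + 1) * block, c * block:(c + 1) * block] = True
    return mask

rng = np.random.default_rng(0)
frames = rng.random((5, 8, 8))           # toy "video": 5 frames of 8x8 pixels
threshold = 0.1
events = delta_encode(frames, threshold)

# Keep one 4x4 block out of four, i.e. mask 75% of the input (hypothetical ROI).
mask = block_mask((8, 8), block=4, keep_blocks=[(0, 0)])
masked_events = events * mask            # events from masked regions are dropped

total_events = np.count_nonzero(events)
kept_events = np.count_nonzero(masked_events)
print(f"events before masking: {total_events}, after masking: {kept_events}")
```

In this toy setting, every event dropped at the input is one fewer message for downstream layers to process, which is the intuition behind the reduction in spiking activity reported in the abstract.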

Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity

Feb 03, 2025

Alessandro Pierro, Steven Abreu, Jonathan Timcheck, Philipp Stratmann, Andreas Wild, Sumit Bam Shrestha

Abstract:Linear recurrent neural networks enable powerful long-range sequence modeling with constant memory usage and time-per-token during inference. These architectures hold promise for streaming applications at the edge, but deployment in resource-constrained environments requires hardware-aware optimizations to minimize latency and energy consumption. Unstructured sparsity offers a compelling solution, enabling substantial reductions in compute and memory requirements--when accelerated by compatible hardware platforms. In this paper, we conduct a scaling study to investigate the Pareto front of performance and efficiency across inference compute budgets. We find that highly sparse linear RNNs consistently achieve better efficiency-performance trade-offs than dense baselines, with 2x less compute and 36% less memory at iso-accuracy. Our models achieve state-of-the-art results on a real-time streaming task for audio denoising. By quantizing our sparse models to fixed-point arithmetic and deploying them on the Intel Loihi 2 neuromorphic chip for real-time processing, we translate model compression into tangible gains of 42x lower latency and 149x lower energy consumption compared to a dense model on an edge GPU. Our findings showcase the transformative potential of unstructured sparsity, paving the way for highly efficient recurrent neural networks in real-world, resource-constrained environments.
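As a rough illustration of why unstructured sparsity reduces compute in a linear recurrence, the sketch below prunes the projection matrices of a toy diagonal linear RNN and counts the remaining multiply-accumulates. The abstract does not state the pruning criterion or the exact architecture, so magnitude pruning and the diagonal recurrence are assumptions made for this example. The helper names `magnitude_prune` and `linear_rnn` and the 75% sparsity level are hypothetical.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of the entries of w."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # k-th smallest absolute value; entries at or below it are pruned
    cutoff = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) > cutoff, w, 0.0)

def linear_rnn(a, b, c, xs):
    """Diagonal linear recurrence h_t = a * h_{t-1} + b @ x_t, y_t = c @ h_t.
    The state h has a fixed size, so memory per step is constant."""
    h = np.zeros(a.shape[0])
    ys = []
    for x in xs:
        h = a * h + b @ x
        ys.append(c @ h)
    return np.stack(ys)

rng = np.random.default_rng(0)
n, d = 16, 4                              # state size and input/output width
a = rng.uniform(0.5, 0.99, size=n)        # stable diagonal recurrence
b = rng.normal(size=(n, d))
c = rng.normal(size=(d, n))

b_sparse = magnitude_prune(b, 0.75)
c_sparse = magnitude_prune(c, 0.75)
ys = linear_rnn(a, b_sparse, c_sparse, rng.normal(size=(10, d)))

dense_macs = b.size + c.size
sparse_macs = np.count_nonzero(b_sparse) + np.count_nonzero(c_sparse)
print(f"projection MACs per step: dense={dense_macs}, sparse={sparse_macs}")
```

The saving only materializes on hardware that can skip zero weights, which is the condition the abstract attaches to its latency and energy figures.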

* Under review

Emulating Brain-like Rapid Learning in Neuromorphic Edge Computing

Aug 28, 2024

Kenneth Stewart, Michael Neumeier, Sumit Bam Shrestha, Garrick Orchard, Emre Neftci

Abstract:Achieving personalized intelligence at the edge with real-time learning capabilities holds enormous promise in enhancing our daily experiences and helping decision making, planning, and sensing. However, efficient and reliable edge learning remains difficult with current technology due to the lack of personalized data, insufficient hardware capabilities, and inherent challenges posed by online learning. Over time and across multiple developmental stages, the brain has evolved to efficiently incorporate new knowledge by gradually building on previous knowledge. In this work, we emulate the multiple stages of learning with digital neuromorphic technology that simulates the neural and synaptic processes of the brain using two stages of learning. First, a meta-training stage trains the hyperparameters of synaptic plasticity for one-shot learning using a differentiable simulation of the neuromorphic hardware. This meta-training process refines a hardware local three-factor synaptic plasticity rule and its associated hyperparameters to align with the trained task domain. In a subsequent deployment stage, these optimized hyperparameters enable fast, data-efficient, and accurate learning of new classes. We demonstrate our approach using event-driven vision sensor data and the Intel Loihi neuromorphic processor with its plasticity dynamics, achieving real-time one-shot learning of new classes that is vastly improved over transfer learning. Our methodology can be deployed with arbitrary plasticity models and can be applied to situations demanding quick learning and adaptation at the edge, such as navigating unfamiliar environments or learning unexpected categories of data through user engagement.

* 17 page journal article. Submitted to IOP NCE

Efficient Video and Audio processing with Loihi 2

Oct 05, 2023

Sumit Bam Shrestha, Jonathan Timcheck, Paxon Frady, Leobardo Campos-Macias, Mike Davies

Abstract:Loihi 2 is an asynchronous, brain-inspired research processor that generalizes several fundamental elements of neuromorphic architecture, such as stateful neuron models communicating with event-driven spikes, in order to address limitations of the first generation Loihi. Here we explore and characterize some of these generalizations, such as sigma-delta encapsulation, resonate-and-fire neurons, and integer-valued spikes, as applied to standard video, audio, and signal processing tasks. We find that these new neuromorphic approaches can provide orders of magnitude gains in combined efficiency and latency (energy-delay-product) for feed-forward and convolutional neural networks applied to video, audio denoising, and spectral transforms compared to state-of-the-art solutions.

* 5 pages, 3 figures

NeuroBench: Advancing Neuromorphic Computing through Collaborative, Fair and Representative Benchmarking

Apr 15, 2023

Jason Yik, Soikat Hasan Ahmed, Zergham Ahmed, Brian Anderson, Andreas G. Andreou, Chiara Bartolozzi, Arindam Basu, Douwe den Blanken, Petrut Bogdan, Sander Bohte (+62 more)

Abstract:The field of neuromorphic computing holds great promise in terms of advancing computing efficiency and capabilities by following brain-inspired principles. However, the rich diversity of techniques employed in neuromorphic research has resulted in a lack of clear standards for benchmarking, hindering effective evaluation of the advantages and strengths of neuromorphic methods compared to traditional deep-learning-based methods. This paper presents a collaborative effort, bringing together members from academia and the industry, to define benchmarks for neuromorphic computing: NeuroBench. The goals of NeuroBench are to be a collaborative, fair, and representative benchmark suite developed by the community, for the community. In this paper, we discuss the challenges associated with benchmarking neuromorphic solutions, and outline the key features of NeuroBench. We believe that NeuroBench will be a significant step towards defining standards that can unify the goals of neuromorphic computing and drive its technological progress. Please visit neurobench.ai for the latest updates on the benchmark tasks and metrics.

The Intel Neuromorphic DNS Challenge

Mar 17, 2023

Jonathan Timcheck, Sumit Bam Shrestha, Daniel Ben Dayan Rubin, Adam Kupryjanow, Garrick Orchard, Lukasz Pindor, Timothy Shea, Mike Davies

Abstract:A critical enabler for progress in neuromorphic computing research is the ability to transparently evaluate different neuromorphic solutions on important tasks and to compare them to state-of-the-art conventional solutions. The Intel Neuromorphic Deep Noise Suppression Challenge (Intel N-DNS Challenge), inspired by the Microsoft DNS Challenge, tackles a ubiquitous and commercially relevant task: real-time audio denoising. Audio denoising is likely to reap the benefits of neuromorphic computing due to its low-bandwidth, temporal nature and its relevance for low-power devices. The Intel N-DNS Challenge consists of two tracks: a simulation-based algorithmic track to encourage algorithmic innovation, and a neuromorphic hardware (Loihi 2) track to rigorously evaluate solutions. For both tracks, we specify an evaluation methodology based on energy, latency, and resource consumption in addition to output audio quality. We make the Intel N-DNS Challenge dataset scripts and evaluation code freely accessible, encourage community participation with monetary prizes, and release a neuromorphic baseline solution which shows promising audio quality, high power efficiency, and low resource consumption when compared to Microsoft NsNet2 and a proprietary Intel denoising model used in production. We hope the Intel N-DNS Challenge will hasten innovation in neuromorphic algorithms research, especially in the area of training tools and methods for real-time signal processing. We expect the winners of the challenge will demonstrate that for problems like audio denoising, significant gains in power and resources can be realized on neuromorphic devices available today compared to conventional state-of-the-art solutions.

* 13 pages, 4 figures, 1 table

Spikemax: Spike-based Loss Methods for Classification

May 19, 2022

Sumit Bam Shrestha, Longwei Zhu, Pengfei Sun

Abstract:Spiking Neural Networks (SNNs) are a promising research paradigm for low-power edge-based computing. Recent work in SNN backpropagation has enabled training of SNNs for practical tasks. However, since spikes are binary events in time, standard loss formulations are not directly compatible with spike output. As a result, current works are limited to using mean-squared loss of spike count. In this paper, we formulate the output probability interpretation from the spike count measure and introduce spike-based negative log-likelihood measures, which are better suited for classification tasks, especially in terms of energy efficiency and inference latency. We compare our loss measures with other existing alternatives and evaluate using classification performance on three neuromorphic benchmark datasets: NMNIST, DVS Gesture and N-TIDIGITS18. In addition, we demonstrate state-of-the-art performance on these datasets, achieving faster inference speed and lower energy consumption.
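One way to read "output probability interpretation from the spike count measure" is to treat per-class spike counts as logits of a softmax and take the negative log-likelihood of the target class. The sketch below follows that reading. It is an assumption made for illustration, and the paper's exact formulation may differ. The function names `spike_count_log_probs` and `spike_nll` and the toy spike trains are hypothetical.

```python
import numpy as np

def spike_count_log_probs(spikes):
    """Log-probabilities from a softmax over per-class spike counts.
    `spikes` has shape (classes, time) with binary entries."""
    counts = spikes.sum(axis=-1).astype(float)
    shifted = counts - counts.max()           # subtract max for numerical stability
    return shifted - np.log(np.exp(shifted).sum())

def spike_nll(spikes, label):
    """Negative log-likelihood of the target class under the count softmax."""
    return -spike_count_log_probs(spikes)[label]

# Three output neurons observed for 20 time steps.
spikes = np.zeros((3, 20), dtype=int)
spikes[0, ::2] = 1     # class 0 fires 10 times
spikes[1, ::5] = 1     # class 1 fires 4 times
spikes[2, ::10] = 1    # class 2 fires 2 times

log_probs = spike_count_log_probs(spikes)
loss_correct = spike_nll(spikes, label=0)   # target is the most active neuron
loss_wrong = spike_nll(spikes, label=2)     # target is the least active neuron
print(loss_correct, loss_wrong)
```

Unlike a mean-squared loss on spike counts, a likelihood of this kind only asks the target neuron to fire more than the others, not to hit a prescribed count.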

* Accepted by IJCNN 2022

Efficient Neuromorphic Signal Processing with Loihi 2

Nov 05, 2021

Garrick Orchard, E. Paxon Frady, Daniel Ben Dayan Rubin, Sophia Sanborn, Sumit Bam Shrestha, Friedrich T. Sommer, Mike Davies

Abstract:The biologically inspired spiking neurons used in neuromorphic computing are nonlinear filters with dynamic state variables -- very different from the stateless neuron models used in deep learning. The next version of Intel's neuromorphic research processor, Loihi 2, supports a wide range of stateful spiking neuron models with fully programmable dynamics. Here we showcase advanced spiking neuron models that can be used to efficiently process streaming data in simulation experiments on emulated Loihi 2 hardware. In one example, Resonate-and-Fire (RF) neurons are used to compute the Short Time Fourier Transform (STFT) with similar computational complexity but 47x less output bandwidth than the conventional STFT. In another example, we describe an algorithm for optical flow estimation using spatiotemporal RF neurons that requires over 90x fewer operations than a conventional DNN-based solution. We also demonstrate promising preliminary results using backpropagation to train RF neurons for audio classification tasks. Finally, we show that a cascade of Hopf resonators - a variant of the RF neuron - replicates novel properties of the cochlea and motivates an efficient spike-based spectrogram encoder.
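The frequency selectivity that lets Resonate-and-Fire neurons act as spectral analyzers can be seen in a simplified model: a complex state that rotates at a fixed angular frequency and decays slightly each step. The sketch below shows only this subthreshold dynamic. It omits the spiking mechanism and any Loihi 2 fixed-point detail, and the function name `rf_response`, the decay value, and the test frequencies are hypothetical choices.

```python
import numpy as np

def rf_response(signal, omega, decay=0.99):
    """Amplitude of a damped complex oscillator driven by `signal`.
    State update: z[t] = decay * exp(i * omega) * z[t-1] + x[t]."""
    z = 0j
    rotation = decay * np.exp(1j * omega)
    amplitudes = []
    for x in signal:
        z = rotation * z + x
        amplitudes.append(abs(z))
    return np.array(amplitudes)

t = np.arange(2000)
omega = 2 * np.pi * 0.05                  # neuron's resonant frequency (rad/step)

on_resonance = rf_response(np.cos(omega * t), omega)
off_resonance = rf_response(np.cos(3 * omega * t), omega)

print(f"steady-state amplitude at resonance:  {on_resonance[-1]:.2f}")
print(f"steady-state amplitude off resonance: {off_resonance[-1]:.2f}")
```

A bank of such units with different `omega` values responds strongly only to matching input frequencies, so the state of each unit behaves like one bin of an exponentially windowed Fourier transform.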

Online Few-shot Gesture Learning on a Neuromorphic Processor

Aug 03, 2020

Kenneth Stewart, Garrick Orchard, Sumit Bam Shrestha, Emre Neftci

Abstract:We present the Surrogate-gradient Online Error-triggered Learning (SOEL) system for online few-shot learning on neuromorphic processors. The SOEL learning system uses a combination of transfer learning and principles of computational neuroscience and deep learning. We show that partially trained deep Spiking Neural Networks (SNNs) implemented on neuromorphic hardware can rapidly adapt online to new classes of data within a domain. SOEL updates trigger when an error occurs, enabling faster learning with fewer updates. Using gesture recognition as a case study, we show SOEL can be used for online few-shot learning of new classes of pre-recorded gesture data and rapid online learning of new gestures from data streamed live from a Dynamic Active-pixel Vision Sensor to an Intel Loihi neuromorphic research processor.

* 10 pages, submitted to IEEE JETCAS for review

Event-Based Angular Velocity Regression with Spiking Networks

Mar 05, 2020

Mathias Gehrig, Sumit Bam Shrestha, Daniel Mouritzen, Davide Scaramuzza

Abstract:Spiking Neural Networks (SNNs) are bio-inspired networks that process information conveyed as temporal spikes rather than numeric values. A spiking neuron of an SNN only produces a spike whenever a significant number of spikes occur within a short period of time. Due to their spike-based computational model, SNNs can process output from event-based, asynchronous sensors without any pre-processing at extremely low power, unlike standard artificial neural networks. This is possible due to specialized neuromorphic hardware that implements the highly parallelizable concept of SNNs in silicon. Yet, SNNs have not enjoyed the same rise in popularity as artificial neural networks. This stems not only from the fact that their input format is rather unconventional but also from the challenges in training spiking networks. Despite their temporal nature and recent algorithmic advances, they have been mostly evaluated on classification problems. We propose, for the first time, a temporal regression problem of numerical values given events from an event camera. We specifically investigate the prediction of the 3-DOF angular velocity of a rotating event camera with an SNN. The difficulty of this problem arises from the prediction of angular velocities continuously in time directly from irregular, asynchronous event-based input. Directly utilising the output of event cameras without any pre-processing ensures that we inherit all the benefits that they provide over conventional cameras, namely high temporal resolution, high dynamic range, and no motion blur. To assess the performance of SNNs on this task, we introduce a synthetic event camera dataset generated from real-world panoramic images and show that we can successfully train an SNN to perform angular velocity regression.

* IEEE International Conference on Robotics and Automation (ICRA), Paris, 2020
