Ilkin Aliyev

Sparsity-Aware Hardware-Software Co-Design of Spiking Neural Networks: An Overview

Aug 26, 2024

Ilkin Aliyev, Kama Svoboda, Tosiron Adegbija, Jean-Marc Fellous

Abstract: Spiking Neural Networks (SNNs) are inspired by the sparse and event-driven nature of biological neural processing, and offer the potential for ultra-low-power artificial intelligence. However, realizing their efficiency benefits requires specialized hardware and a co-design approach that effectively leverages sparsity. We explore the hardware-software co-design of sparse SNNs, examining how sparsity representation, hardware architectures, and training techniques influence hardware efficiency. We analyze the impact of static and dynamic sparsity, discuss the implications of different neuron models and encoding schemes, and investigate the need for adaptability in hardware designs. Our work aims to illuminate the path towards embedded neuromorphic systems that fully exploit the computational advantages of sparse SNNs.

* IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC 2024)
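
To make the static versus dynamic sparsity distinction in the abstract concrete, here is a minimal Python sketch, not taken from the paper. It assumes binary input spikes and a small weight matrix, and compares a dense accumulation with an event-driven one that skips silent inputs (dynamic sparsity) and zero-valued weights (static sparsity). All function names are hypothetical and chosen only for illustration.

```python
def dense_accumulate(weights, spikes):
    """Accumulate synaptic input for every (input, neuron) pair.

    weights[i][j] is the weight from input i to output neuron j.
    Returns the per-neuron sums and the number of multiply-accumulates.
    """
    n_out = len(weights[0])
    out = [0.0] * n_out
    ops = 0
    for i, s in enumerate(spikes):
        for j in range(n_out):
            out[j] += weights[i][j] * s
            ops += 1
    return out, ops


def to_sparse(weights):
    """Keep only nonzero weights as (neuron_index, weight) pairs per input."""
    return [[(j, w) for j, w in enumerate(row) if w != 0.0] for row in weights]


def event_driven_accumulate(sparse_weights, spikes, n_out):
    """Accumulate only where an input spiked and the weight is nonzero."""
    out = [0.0] * n_out
    ops = 0
    for i, s in enumerate(spikes):
        if not s:
            continue  # dynamic sparsity: this input did not spike
        for j, w in sparse_weights[i]:  # static sparsity: zero weights were dropped
            out[j] += w  # a spike has value 1, so no multiplication is needed
            ops += 1
    return out, ops


if __name__ == "__main__":
    weights = [[0.5, 0.0], [0.0, -1.0], [2.0, 0.0]]  # illustrative values
    spikes = [1, 0, 1]
    print(dense_accumulate(weights, spikes))
    print(event_driven_accumulate(to_sparse(weights), spikes, 2))
```

Both functions return the same sums; only the operation count differs, which is the quantity a sparsity-aware design would try to reduce.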

GPU-RANC: A CUDA Accelerated Simulation Framework for Neuromorphic Architectures

Apr 24, 2024

Sahil Hassan, Michael Inouye, Miguel C. Gonzalez, Ilkin Aliyev, Joshua Mack, Maisha Hafiz, Ali Akoglu

Abstract: Open-source simulation tools play a crucial role in helping neuromorphic application engineers and hardware architects investigate performance bottlenecks and explore design optimizations before committing to silicon. Reconfigurable Architecture for Neuromorphic Computing (RANC) is one such tool that offers the ability to execute pre-trained Spiking Neural Network (SNN) models within a unified ecosystem through both software-based simulation and FPGA-based emulation. With its flexible and highly parameterized design, RANC has been used by the community to study implementation bottlenecks, tune architectural parameters or modify neuron behavior based on application insights, and study the trade-offs between hardware performance and network accuracy. In designing architectures for neuromorphic computing, there is a very large number of configuration parameters, such as the number and precision of weights per neuron, neuron and axon counts per core, network topology, and neuron behavior. To accelerate such studies and provide users with streamlined, productive design space exploration, in this paper we introduce the GPU-based implementation of RANC. We summarize our parallelization approach and quantify the speedup gains achieved with GPU-based tick-accurate simulations across various use cases. We demonstrate a speedup of up to 780 times over the serial version of the RANC simulator on a 512-neuromorphic-core MNIST inference application. We believe that the RANC ecosystem now provides a much more feasible avenue for exploring different optimizations for accelerating SNNs and for performing richer studies by enabling rapid convergence to optimized neuromorphic architectures.

* Accepted for publication in Neuro-Inspired Computational Elements (NICE) Workshop 2024

Fine-Tuning Surrogate Gradient Learning for Optimal Hardware Performance in Spiking Neural Networks

Feb 09, 2024

Ilkin Aliyev, Tosiron Adegbija

Abstract: The highly sparse activations in Spiking Neural Networks (SNNs) can provide tremendous energy efficiency benefits when carefully exploited in hardware. The behavior of sparsity in SNNs is uniquely shaped by the dataset and training hyperparameters. This work reveals novel insights into the impacts of training on hardware performance. Specifically, we explore the trade-offs between model accuracy and hardware efficiency. We focus on three key hyperparameters: surrogate gradient functions, beta, and membrane threshold. Results on an FPGA-based hardware platform show that the fast sigmoid surrogate function yields a lower firing rate with similar accuracy compared to the arctangent surrogate on the SVHN dataset. Furthermore, by cross-sweeping the beta and membrane threshold hyperparameters, we can achieve a 48% reduction in hardware-based inference latency with only a 2.88% trade-off in inference accuracy compared to the default setting. Overall, this study highlights the importance of fine-tuning model hyperparameters for designing efficient SNN hardware accelerators, evidenced by the fine-tuned model achieving a 1.72x improvement in accelerator efficiency (FPS/W) compared to the most recent work.
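
As a rough illustration of the three hyperparameters named in the abstract, the following Python sketch shows one common form of each surrogate gradient and a single leaky integrate-and-fire neuron swept over beta and threshold. It is not the authors' code. It assumes that beta is the membrane decay factor, that the neuron resets by subtraction, and that the slope and alpha parameters and all function names are hypothetical placeholders.

```python
import math


def fast_sigmoid_grad(u, slope=25.0):
    """Derivative of the fast sigmoid u / (1 + slope * |u|).

    u is the membrane potential minus the threshold; slope is an assumed value.
    """
    return 1.0 / (1.0 + slope * abs(u)) ** 2


def arctan_grad(u, alpha=2.0):
    """Derivative of (1 / pi) * arctan(pi * alpha * u / 2); alpha is an assumed value."""
    return (alpha / 2.0) / (1.0 + (math.pi * alpha * u / 2.0) ** 2)


def lif_firing_rate(inputs, beta, threshold):
    """Simulate one leaky integrate-and-fire neuron and return spikes per time step.

    beta scales the membrane potential carried over from the previous step.
    Reset by subtraction is an assumption made for this sketch.
    """
    mem = 0.0
    spikes = 0
    for x in inputs:
        mem = beta * mem + x
        if mem >= threshold:
            spikes += 1
            mem -= threshold
    return spikes / len(inputs)


if __name__ == "__main__":
    inputs = [0.3] * 100  # constant input current, purely illustrative
    # Cross-sweep beta and threshold and report the resulting firing rate.
    for beta in (0.5, 0.7, 0.9):
        for threshold in (0.5, 1.0, 2.0):
            rate = lif_firing_rate(inputs, beta, threshold)
            print(f"beta={beta} threshold={threshold} rate={rate:.2f}")
```

In this toy setting, a lower beta or a higher threshold reduces the firing rate, which is the kind of sparsity effect the abstract relates to hardware latency. The sketch says nothing about accuracy, which depends on training.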
