Abstract: Spiking neural networks, which draw inspiration from biological constraints of the brain, promise an energy-efficient paradigm for artificial intelligence. However, guiding principles for training these networks robustly remain difficult to identify, and training becomes harder still when the biological constraints of excitatory and inhibitory connections are incorporated. In this work, we identify several key factors, such as low initial firing rates and diverse inhibitory spiking patterns, that determine the overall ability to train spiking networks with various ratios of excitatory to inhibitory neurons on AI-relevant datasets. The results indicate that networks with the biologically realistic 80:20 excitatory:inhibitory balance can train reliably at low activity levels and in noisy environments. Additionally, the Van Rossum distance, a measure of spike train synchrony, provides insight into the importance of inhibitory neurons in increasing network robustness to noise. This work supports further biologically informed large-scale networks and energy-efficient hardware implementations.
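For readers unfamiliar with the Van Rossum distance mentioned above, the following is a minimal sketch of one common formulation: each spike train is filtered with a causal exponential kernel, and the distance is the normalized L2 norm of the difference between the filtered traces. The function name `van_rossum_distance`, the time constant `tau`, the grid step `dt`, and the horizon `t_max` are illustrative assumptions; the abstract does not state which variant or parameters the authors used.

```python
import numpy as np


def van_rossum_distance(spikes_a, spikes_b, tau=10.0, dt=0.1, t_max=None):
    """Approximate Van Rossum distance between two spike trains.

    spikes_a, spikes_b: spike times (same units as tau, e.g. ms).
    tau, dt, t_max: assumed kernel time constant, grid step, and horizon.
    """
    spikes_a = np.asarray(spikes_a, dtype=float)
    spikes_b = np.asarray(spikes_b, dtype=float)
    if t_max is None:
        # Extend the grid well past the last spike so the kernel tail decays.
        last = max(spikes_a.max(initial=0.0), spikes_b.max(initial=0.0))
        t_max = last + 10.0 * tau
    t = np.arange(0.0, t_max, dt)

    def filtered(spikes):
        # Sum of causal exponentials exp(-(t - t_i) / tau) for t >= t_i.
        lag = t[:, None] - spikes[None, :]
        kernel = np.exp(-np.clip(lag, 0.0, None) / tau)
        return np.where(lag >= 0.0, kernel, 0.0).sum(axis=1)

    diff = filtered(spikes_a) - filtered(spikes_b)
    # D^2 = (1 / tau) * integral of diff(t)^2 dt, as a Riemann sum.
    return float(np.sqrt(np.sum(diff ** 2) * dt / tau))
```

Under this normalization, identical trains give a distance of zero, and a single spike compared against an empty train gives approximately the square root of one half. Smaller values indicate more synchronous trains at the time scale set by `tau`.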
Abstract: Continual learning is considered a promising step towards next-generation Artificial Intelligence (AI), where deep neural networks (DNNs) make decisions by continuously learning a sequence of different tasks, akin to human learning processes. The field is still quite primitive, with existing works focusing primarily on avoiding (catastrophic) forgetting. However, since forgetting is inevitable given bounded memory and unbounded task loads, 'how to reasonably forget' is a problem continual learning must address in order to reduce the performance gap between AIs and humans in terms of 1) memory efficiency, 2) generalizability, and 3) robustness when dealing with noisy data. To address this, we propose a novel ScheMAtic memory peRsistence and Transience (SMART) framework for continual learning with external memory that builds on recent advances in neuroscience. Efficiency and generalizability are enhanced by a novel long-term forgetting mechanism and schematic memory, using sparsity and 'backward positive transfer' constraints with theoretical guarantees on the error bound. Robustness is enhanced by a novel short-term forgetting mechanism inspired by background information-gated learning. Finally, an extensive experimental analysis on both benchmark and real-world datasets demonstrates the effectiveness and efficiency of our model.