Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Anirvan M. Sengupta

In-context denoising with one-layer transformers: connections between attention and associative memory retrieval

Feb 07, 2025

Matthew Smart, Alberto Bietti, Anirvan M. Sengupta

Abstract:We introduce in-context denoising, a task that refines the connection between attention-based architectures and dense associative memory (DAM) networks, also known as modern Hopfield networks. Using a Bayesian framework, we show theoretically and empirically that certain restricted denoising problems can be solved optimally even by a single-layer transformer. We demonstrate that a trained attention layer processes each denoising prompt by performing a single gradient descent update on a context-aware DAM energy landscape, where context tokens serve as associative memories and the query token acts as an initial state. This one-step update yields better solutions than exact retrieval of either a context token or a spurious local minimum, providing a concrete example of DAM networks extending beyond the standard retrieval paradigm. Overall, this work solidifies the link between associative memory and attention mechanisms first identified by Ramsauer et al., and demonstrates the relevance of associative memory models in the study of in-context learning.

Via

Access Paper or Ask Questions

Deep Learning Based Superconductivity: Prediction and Experimental Tests

Dec 17, 2024

Daniel Kaplan, Adam Zhang, Joanna Blawat, Rongying Jin, Robert J. Cava, Viktor Oudovenko, Gabriel Kotliar, Anirvan M. Sengupta, Weiwei Xie

Abstract:The discovery of novel superconducting materials is a longstanding challenge in materials science, with a wealth of potential for applications in energy, transportation, and computing. Recent advances in artificial intelligence (AI) have enabled expediting the search for new materials by efficiently utilizing vast materials databases. In this study, we developed an approach based on deep learning (DL) to predict new superconducting materials. We have synthesized a compound derived from our DL network and confirmed its superconducting properties in agreement with our prediction. Our approach is also compared to previous work based on random forests (RFs). In particular, RFs require knowledge of the chem-ical properties of the compound, while our neural net inputs depend solely on the chemical composition. With the help of hints from our network, we discover a new ternary compound $\textrm{Mo}_{20}\textrm{Re}_{6}\textrm{Si}_{4}$, which becomes superconducting below 5.4 K. We further discuss the existing limitations and challenges associated with using AI to predict and, along with potential future research directions.

* 14 pages + 2 appendices + references. EPJ submission

Via

Access Paper or Ask Questions

Unlocking the Potential of Similarity Matching: Scalability, Supervision and Pre-training

Aug 02, 2023

Yanis Bahroun, Shagesh Sridharan, Atithi Acharya, Dmitri B. Chklovskii, Anirvan M. Sengupta

Abstract:While effective, the backpropagation (BP) algorithm exhibits limitations in terms of biological plausibility, computational cost, and suitability for online learning. As a result, there has been a growing interest in developing alternative biologically plausible learning approaches that rely on local learning rules. This study focuses on the primarily unsupervised similarity matching (SM) framework, which aligns with observed mechanisms in biological systems and offers online, localized, and biologically plausible algorithms. i) To scale SM to large datasets, we propose an implementation of Convolutional Nonnegative SM using PyTorch. ii) We introduce a localized supervised SM objective reminiscent of canonical correlation analysis, facilitating stacking SM layers. iii) We leverage the PyTorch implementation for pre-training architectures such as LeNet and compare the evaluation of features against BP-trained models. This work combines biologically plausible algorithms with computational efficiency opening multiple avenues for further explorations.

Via

Access Paper or Ask Questions

A normative framework for deriving neural networks with multi-compartmental neurons and non-Hebbian plasticity

Feb 20, 2023

David Lipshutz, Yanis Bahroun, Siavash Golkar, Anirvan M. Sengupta, Dmitri B. Chklovskii

Abstract:An established normative approach for understanding the algorithmic basis of neural computation is to derive online algorithms from principled computational objectives and evaluate their compatibility with anatomical and physiological observations. Similarity matching objectives have served as successful starting points for deriving online algorithms that map onto neural networks (NNs) with point neurons and Hebbian/anti-Hebbian plasticity. These NN models account for many anatomical and physiological observations; however, the objectives have limited computational power and the derived NNs do not explain multi-compartmental neuronal structures and non-Hebbian forms of plasticity that are prevalent throughout the brain. In this article, we review and unify recent extensions of the similarity matching approach to address more complex objectives, including a broad range of unsupervised and self-supervised learning tasks that can be formulated as generalized eigenvalue problems or nonnegative matrix factorization problems. Interestingly, the online algorithms derived from these objectives naturally map onto NNs with multi-compartmental neurons and local, non-Hebbian learning rules. Therefore, this unified extension of the similarity matching approach provides a normative framework that facilitates understanding the multi-compartmental neuronal structures and non-Hebbian plasticity found throughout the brain.

Via

Access Paper or Ask Questions

Neural optimal feedback control with local learning rules

Nov 12, 2021

Johannes Friedrich, Siavash Golkar, Shiva Farashahi, Alexander Genkin, Anirvan M. Sengupta, Dmitri B. Chklovskii

Figure 1 for Neural optimal feedback control with local learning rules

Figure 2 for Neural optimal feedback control with local learning rules

Figure 3 for Neural optimal feedback control with local learning rules

Figure 4 for Neural optimal feedback control with local learning rules

Abstract:A major problem in motor control is understanding how the brain plans and executes proper movements in the face of delayed and noisy stimuli. A prominent framework for addressing such control problems is Optimal Feedback Control (OFC). OFC generates control actions that optimize behaviorally relevant criteria by integrating noisy sensory stimuli and the predictions of an internal model using the Kalman filter or its extensions. However, a satisfactory neural model of Kalman filtering and control is lacking because existing proposals have the following limitations: not considering the delay of sensory feedback, training in alternating phases, and requiring knowledge of the noise covariance matrices, as well as that of systems dynamics. Moreover, the majority of these studies considered Kalman filtering in isolation, and not jointly with control. To address these shortcomings, we introduce a novel online algorithm which combines adaptive Kalman filtering with a model free control approach (i.e., policy gradient algorithm). We implement this algorithm in a biologically plausible neural network with local synaptic plasticity rules. This network performs system identification and Kalman filtering, without the need for multiple phases with distinct update rules or the knowledge of the noise covariances. It can perform state estimation with delayed sensory feedback, with the help of an internal model. It learns the control policy without requiring any knowledge of the dynamics, thus avoiding the need for weight transport. In this way, our implementation of OFC solves the credit assignment problem needed to produce the appropriate sensory-motor control in the presence of stimulus delay.

* Manuscript and supplementary material of NeurIPS 2021 paper

Via

Access Paper or Ask Questions

A Similarity-preserving Neural Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit

Feb 10, 2021

Yanis Bahroun, Anirvan M. Sengupta, Dmitri B. Chklovskii

Figure 1 for A Similarity-preserving Neural Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit

Figure 2 for A Similarity-preserving Neural Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit

Figure 3 for A Similarity-preserving Neural Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit

Figure 4 for A Similarity-preserving Neural Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit

Abstract:Learning to detect content-independent transformations from data is one of the central problems in biological and artificial intelligence. An example of such problem is unsupervised learning of a visual motion detector from pairs of consecutive video frames. Rao and Ruderman formulated this problem in terms of learning infinitesimal transformation operators (Lie group generators) via minimizing image reconstruction error. Unfortunately, it is difficult to map their model onto a biologically plausible neural network (NN) with local learning rules. Here we propose a biologically plausible model of motion detection. We also adopt the transformation-operator approach but, instead of reconstruction-error minimization, start with a similarity-preserving objective function. An online algorithm that optimizes such an objective function naturally maps onto an NN with biologically plausible learning rules. The trained NN recapitulates major features of the well-studied motion detector in the fly. In particular, it is consistent with the experimental observation that local motion detectors combine information from at least three adjacent pixels, something that contradicts the celebrated Hassenstein-Reichardt model.

* Body and supplementary materials of NeurIPS 2019 paper

Via

Access Paper or Ask Questions

A biologically plausible neural network for local supervision in cortical microcircuits

Nov 30, 2020

Siavash Golkar, David Lipshutz, Yanis Bahroun, Anirvan M. Sengupta, Dmitri B. Chklovskii

Figure 1 for A biologically plausible neural network for local supervision in cortical microcircuits

Figure 2 for A biologically plausible neural network for local supervision in cortical microcircuits

Figure 3 for A biologically plausible neural network for local supervision in cortical microcircuits

Figure 4 for A biologically plausible neural network for local supervision in cortical microcircuits

Abstract:The backpropagation algorithm is an invaluable tool for training artificial neural networks; however, because of a weight sharing requirement, it does not provide a plausible model of brain function. Here, in the context of a two-layer network, we derive an algorithm for training a neural network which avoids this problem by not requiring explicit error computation and backpropagation. Furthermore, our algorithm maps onto a neural network that bears a remarkable resemblance to the connectivity structure and learning rules of the cortex. We find that our algorithm empirically performs comparably to backprop on a number of datasets.

* Abstract presented at the NeurIPS 2020 workshop "Beyond Backpropagation". arXiv admin note: text overlap with arXiv:2010.12660

Via

Access Paper or Ask Questions

A simple normative network approximates local non-Hebbian learning in the cortex

Oct 23, 2020

Siavash Golkar, David Lipshutz, Yanis Bahroun, Anirvan M. Sengupta, Dmitri B. Chklovskii

Figure 1 for A simple normative network approximates local non-Hebbian learning in the cortex

Figure 2 for A simple normative network approximates local non-Hebbian learning in the cortex

Figure 3 for A simple normative network approximates local non-Hebbian learning in the cortex

Figure 4 for A simple normative network approximates local non-Hebbian learning in the cortex

Abstract:To guide behavior, the brain extracts relevant features from high-dimensional data streamed by sensory organs. Neuroscience experiments demonstrate that the processing of sensory inputs by cortical neurons is modulated by instructive signals which provide context and task-relevant information. Here, adopting a normative approach, we model these instructive signals as supervisory inputs guiding the projection of the feedforward data. Mathematically, we start with a family of Reduced-Rank Regression (RRR) objective functions which include Reduced Rank (minimum) Mean Square Error (RRMSE) and Canonical Correlation Analysis (CCA), and derive novel offline and online optimization algorithms, which we call Bio-RRR. The online algorithms can be implemented by neural networks whose synaptic learning rules resemble calcium plateau potential dependent plasticity observed in the cortex. We detail how, in our model, the calcium plateau potential can be interpreted as a backpropagating error signal. We demonstrate that, despite relying exclusively on biologically plausible local learning rules, our algorithms perform competitively with existing implementations of RRMSE and CCA.

* Body and supplementary materials of NeurIPS 2020 paper. 19 pages, 7 figures

Via

Access Paper or Ask Questions

A biologically plausible neural network for multi-channel Canonical Correlation Analysis

Oct 01, 2020

David Lipshutz, Yanis Bahroun, Siavash Golkar, Anirvan M. Sengupta, Dmitri B. Chkovskii

Figure 1 for A biologically plausible neural network for multi-channel Canonical Correlation Analysis

Figure 2 for A biologically plausible neural network for multi-channel Canonical Correlation Analysis

Figure 3 for A biologically plausible neural network for multi-channel Canonical Correlation Analysis

Figure 4 for A biologically plausible neural network for multi-channel Canonical Correlation Analysis

Abstract:Cortical pyramidal neurons receive inputs from multiple distinct neural populations and integrate these inputs in separate dendritic compartments. We explore the possibility that cortical microcircuits implement Canonical Correlation Analysis (CCA), an unsupervised learning method that projects the inputs onto a common subspace so as to maximize the correlations between the projections. To this end, we seek a multi-channel CCA algorithm that can be implemented in a biologically plausible neural network. For biological plausibility, we require that the network operates in the online setting and its synaptic update rules are local. Starting from a novel CCA objective function, we derive an online optimization algorithm whose optimization steps can be implemented in a single-layer neural network with multi-compartmental neurons and local non-Hebbian learning rules. We also derive an extension of our online CCA algorithm with adaptive output rank and output whitening. Interestingly, the extension maps onto a neural network whose neural architecture and synaptic updates resemble neural circuitry and synaptic plasticity observed experimentally in cortical pyramidal neurons.

* 34 pages, 8 figures

Via

Access Paper or Ask Questions

A Neural Network for Semi-Supervised Learning on Manifolds

Aug 21, 2019

Alexander Genkin, Anirvan M. Sengupta, Dmitri Chklovskii

Figure 1 for A Neural Network for Semi-Supervised Learning on Manifolds

Figure 2 for A Neural Network for Semi-Supervised Learning on Manifolds

Figure 3 for A Neural Network for Semi-Supervised Learning on Manifolds

Figure 4 for A Neural Network for Semi-Supervised Learning on Manifolds

Abstract:Semi-supervised learning algorithms typically construct a weighted graph of data points to represent a manifold. However, an explicit graph representation is problematic for neural networks operating in the online setting. Here, we propose a feed-forward neural network capable of semi-supervised learning on manifolds without using an explicit graph representation. Our algorithm uses channels that represent localities on the manifold such that correlations between channels represent manifold structure. The proposed neural network has two layers. The first layer learns to build a representation of low-dimensional manifolds in the input data as proposed recently in [8]. The second learns to classify data using both occasional supervision and similarity of the manifold representation of the data. The channel carrying label information for the second layer is assumed to be "silent" most of the time. Learning in both layers is Hebbian, making our network design biologically plausible. We experimentally demonstrate the effect of semi-supervised learning on non-trivial manifolds.

* 12 pages, 4 figures, accepted in ICANN 2019

Via

Access Paper or Ask Questions