Thomas Miconi

Thinking agents for zero-shot generalization to qualitatively novel tasks

Mar 25, 2025

Thomas Miconi, Kevin McKee, Yicong Zheng, Jed McCaleb

Abstract: Intelligent organisms can solve truly novel problems which they have never encountered before, either in their lifetime or their evolution. An important component of this capacity is the ability to "think", that is, to mentally manipulate objects, concepts and behaviors in order to plan and evaluate possible solutions to novel problems, even without environment interaction. To generate problems that are truly qualitatively novel, while still solvable zero-shot (by mental simulation), we use the combinatorial nature of environments: we train the agent while withholding a specific combination of the environment's elements. The novel test task, based on this combination, is thus guaranteed to be truly novel, while still mentally simulable since the agent has been exposed to each individual element (and their pairwise interactions) during training. We propose a method to train agents endowed with world models to make use of their mental simulation abilities, by selecting tasks based on the difference between the agent's pre-thinking and post-thinking performance. When tested on the novel, withheld problem, the resulting agent successfully simulated alternative scenarios and used the resulting information to guide its behavior in the actual environment, solving the novel task in a single real-environment trial (zero-shot).
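
The withholding scheme described in this abstract can be illustrated with a minimal sketch. The factor names and values below are hypothetical and chosen only for illustration; they are not the paper's environment. The sketch builds every combination of three factors, withholds one, and checks that each pair of elements in the withheld task still appears somewhere in training.

```python
from itertools import combinations, product

# Hypothetical environment factors (illustrative only, not from the paper).
factors = {
    "object": ["key", "ball"],
    "terrain": ["ice", "sand"],
    "goal": ["door", "chest"],
}
names = list(factors)

# Every task is one combination of factor values.
all_tasks = list(product(*factors.values()))

# Withhold a single combination for zero-shot testing.
held_out = ("key", "ice", "door")
train_tasks = [t for t in all_tasks if t != held_out]


def pairs(task):
    """Return the set of pairwise (factor, value) combinations in a task."""
    return {
        ((names[i], task[i]), (names[j], task[j]))
        for i, j in combinations(range(len(task)), 2)
    }


# Collect all pairwise interactions seen during training.
train_pairs = set().union(*(pairs(t) for t in train_tasks))

# The held-out task is novel as a whole, yet each of its pairs was seen.
novel_but_covered = pairs(held_out) <= train_pairs
print(len(train_tasks), novel_but_covered)
```

With two values per factor, withholding one triple still leaves every pair covered, which is the property the abstract relies on for the test task to remain mentally simulable.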

Brain-inspired learning in artificial neural networks: a review

May 18, 2023

Samuel Schmidgall, Jascha Achterberg, Thomas Miconi, Louis Kirsch, Rojin Ziaei, S. Pardis Hajiseyedrazi, Jason Eshraghian

Abstract: Artificial neural networks (ANNs) have emerged as an essential tool in machine learning, achieving remarkable success across diverse domains, including image and speech generation, game playing, and robotics. However, there exist fundamental differences between ANNs' operating mechanisms and those of the biological brain, particularly concerning learning processes. This paper presents a comprehensive review of current brain-inspired learning representations in artificial neural networks. We investigate the integration of more biologically plausible mechanisms, such as synaptic plasticity, to enhance these networks' capabilities. Moreover, we delve into the potential advantages and challenges accompanying this approach. Ultimately, we pinpoint promising avenues for future research in this rapidly advancing field, which could bring us closer to understanding the essence of intelligence.

A large parametrized space of meta-reinforcement learning tasks

Feb 11, 2023

Thomas Miconi

Abstract: We describe a parametrized space for simple meta-reinforcement-learning (meta-RL) tasks with arbitrary stimuli. The parametrization allows us to randomly generate an arbitrary number of novel simple meta-learning tasks. The space of meta-RL tasks covered by this parametrization includes many well-known meta-RL tasks, such as bandit tasks, the Harlow task, T-mazes, the Daw two-step task and others. Simple extensions allow it to capture tasks based on two-dimensional topological spaces, such as find-the-spot or key-door tasks. We describe a number of randomly generated meta-RL tasks and discuss potential issues arising from random generation.
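
As a point of reference for the simplest task family named in this abstract, here is a minimal sketch of a bandit task framed as meta-RL: each episode hides a freshly sampled rewarded arm that the agent must discover within the episode. This is not the paper's parametrization; the function names, reward probabilities, and the win-stay-lose-shift agent are all assumptions made for illustration.

```python
import random


def make_bandit_episode(rng, n_arms=2, p_good=0.9, p_bad=0.1):
    """Sample one episode: a hidden 'good' arm and a pull function."""
    good_arm = rng.randrange(n_arms)

    def pull(arm):
        # Reward is Bernoulli, with a higher probability on the good arm.
        p = p_good if arm == good_arm else p_bad
        return 1 if rng.random() < p else 0

    return pull, good_arm


def win_stay_lose_shift(pull, n_arms, n_trials):
    """A simple within-episode learner: repeat after reward, else switch."""
    arm, total = 0, 0
    for _ in range(n_trials):
        r = pull(arm)
        total += r
        if r == 0:
            arm = (arm + 1) % n_arms
    return total


rng = random.Random(0)
returns = []
for _ in range(5):
    # The hidden arm is resampled every episode, so a fixed policy cannot
    # succeed; the agent has to adapt within each episode.
    pull, _ = make_bandit_episode(rng)
    returns.append(win_stay_lose_shift(pull, n_arms=2, n_trials=20))
print(returns)
```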

Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning

Jan 17, 2022

Thomas Miconi

Abstract: In meta-learning, networks are trained with external algorithms to learn tasks that require acquiring, storing and exploiting unpredictable information for each new instance of the task. However, animals are able to pick up such cognitive tasks automatically, as a result of their evolved neural architecture and synaptic plasticity mechanisms. Here we evolve neural networks, endowed with plastic connections, over a sizeable set of simple meta-learning tasks based on a framework from computational neuroscience. The resulting evolved network can automatically acquire a novel simple cognitive task, never seen during training, through the spontaneous operation of its evolved neural organization and plasticity structure. We suggest that attending to the multiplicity of loops involved in natural learning may provide useful insight into the emergence of intelligent behavior.

Multi-layer Hebbian networks with modern deep learning frameworks

Jul 04, 2021

Thomas Miconi

Abstract: Deep learning networks generally use non-biological learning methods. By contrast, networks based on more biologically plausible learning, such as Hebbian learning, show comparatively poor performance and difficulties of implementation. Here we show that hierarchical, convolutional Hebbian learning can be implemented almost trivially with modern deep learning frameworks, by using specific losses whose gradients produce exactly the desired Hebbian updates. We provide expressions whose gradients exactly implement a plain Hebbian rule (dw ~= xy), Grossberg's instar rule (dw ~= y(x-w)), and Oja's rule (dw ~= y(x-yw)). As an application, we build Hebbian convolutional multi-layer networks for object recognition. We observe that higher layers of such networks tend to learn large, simple features (Gabor-like filters and blobs), explaining the previously reported decrease in decoding performance over successive layers. To combat this tendency, we introduce interventions (denser activations with sparse plasticity, pruning of connections between layers) which result in sparser learned features, massively increase performance, and allow information to increase over successive layers. We hypothesize that more advanced techniques (dynamic stimuli, trace learning, feedback connections, etc.), together with the massive computational boost offered by modern deep learning frameworks, could greatly improve the performance and biological relevance of multi-layer Hebbian networks.

* All code available at https://github.com/ThomasMiconi/HebbianCNNPyTorch
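
The idea of obtaining Hebbian updates as gradients of a loss can be checked numerically for a single linear neuron. The sketch below is not taken from the linked repository: the two surrogate losses are assumptions chosen so that their negative gradients equal the plain Hebbian and instar updates quoted in the abstract, and a finite-difference gradient stands in for a framework's autograd. For the instar rule the output y is held constant, which plays the role of a stop-gradient.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def numerical_grad(f, w, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at w."""
    grad = []
    for i in range(len(w)):
        wp = list(w)
        wm = list(w)
        wp[i] += eps
        wm[i] -= eps
        grad.append((f(wp) - f(wm)) / (2 * eps))
    return grad


x = [0.5, -1.0, 2.0]   # input
w = [0.1, 0.3, -0.2]   # weights
y = dot(w, x)          # linear neuron output

# Plain Hebbian rule, dw = y * x.
# Assumed surrogate: loss = -0.5 * y^2, where y depends on w.
hebb_loss = lambda w_: -0.5 * dot(w_, x) ** 2
hebb_update = [-g for g in numerical_grad(hebb_loss, w)]
hebb_target = [y * xi for xi in x]

# Instar rule, dw = y * (x - w).
# Assumed surrogate: loss = 0.5 * y * ||x - w||^2 with y frozen.
y_frozen = y
instar_loss = lambda w_: 0.5 * y_frozen * sum(
    (xi - wi) ** 2 for xi, wi in zip(x, w_)
)
instar_update = [-g for g in numerical_grad(instar_loss, w)]
instar_target = [y * (xi - wi) for xi, wi in zip(x, w)]

print(hebb_update, hebb_target)
print(instar_update, instar_target)
```

In a deep learning framework the same effect would come from calling the optimizer on such a loss, so that the ordinary gradient step performs the Hebbian update.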

Enabling Continual Learning with Differentiable Hebbian Plasticity

Jun 30, 2020

Vithursan Thangarasa, Thomas Miconi, Graham W. Taylor

Abstract: Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge. However, catastrophic forgetting poses a grand challenge for neural networks performing such a learning process. Thus, neural networks that are deployed in the real world often struggle in scenarios where the data distribution is non-stationary (concept drift), imbalanced, or not always fully available, i.e., rare edge cases. We propose a Differentiable Hebbian Consolidation model which is composed of a Differentiable Hebbian Plasticity (DHP) Softmax layer that adds a rapid learning plastic component (compressed episodic memory) to the fixed (slow changing) parameters of the softmax output layer, enabling learned representations to be retained for a longer timescale. We demonstrate the flexibility of our method by integrating well-known task-specific synaptic consolidation methods to penalize changes in the slow weights that are important for each target task. We evaluate our approach on the Permuted MNIST, Split MNIST and Vision Datasets Mixture benchmarks, and introduce an imbalanced variant of Permuted MNIST -- a dataset that combines the challenges of class imbalance and concept drift. Our proposed model requires no additional hyperparameters and outperforms comparable baselines by reducing forgetting.

* Published as a conference paper at IJCNN 2020

Learning to Continually Learn

Mar 04, 2020

Shawn Beaulieu, Lapo Frati, Thomas Miconi, Joel Lehman, Kenneth O. Stanley, Jeff Clune, Nick Cheney

Abstract: Continual lifelong learning requires an agent or model to learn many sequentially ordered tasks, building on previous knowledge without catastrophically forgetting it. Much work has gone towards preventing the default tendency of machine learning models to catastrophically forget, yet virtually all such work involves manually-designed solutions to the problem. We instead advocate meta-learning a solution to catastrophic forgetting, allowing AI to learn to continually learn. Inspired by neuromodulatory processes in the brain, we propose A Neuromodulated Meta-Learning Algorithm (ANML). It differentiates through a sequential learning process to meta-learn an activation-gating function that enables context-dependent selective activation within a deep neural network. Specifically, a neuromodulatory (NM) neural network gates the forward pass of another (otherwise normal) neural network called the prediction learning network (PLN). The NM network thus also indirectly controls selective plasticity in the PLN (i.e. its backward pass). ANML enables continual learning without catastrophic forgetting at scale: it produces state-of-the-art continual learning performance, sequentially learning as many as 600 classes (over 9,000 SGD updates).
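
The link between gating the forward pass and controlling plasticity can be seen in a minimal sketch. It assumes an element-wise sigmoid gate multiplying the prediction network's features; the function names and the choice of sigmoid are illustrative assumptions, not details confirmed by the abstract. Because the gate multiplies the activation, the chain rule scales the gradient flowing back through each unit by the same gate value.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_forward(features, gate_logits):
    """Multiply each feature by a gate in (0, 1) from a modulatory network."""
    gates = [sigmoid(g) for g in gate_logits]
    gated = [f * g for f, g in zip(features, gates)]
    return gated, gates


def grad_wrt_features(upstream_grad, gates):
    """Backward pass through the gate: d(f * g)/df = g for each unit."""
    return [u * g for u, g in zip(upstream_grad, gates)]


features = [0.8, 1.5]          # hypothetical PLN activations
gate_logits = [-20.0, 20.0]    # hypothetical NM outputs: close unit 0, open unit 1

gated, gates = gated_forward(features, gate_logits)
feature_grads = grad_wrt_features([1.0, 1.0], gates)

# Unit 0 is silenced in the forward pass and receives almost no gradient,
# so the weights feeding it are effectively protected from updates.
print(gated, feature_grads)
```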

Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity

Feb 24, 2020

Thomas Miconi, Aditya Rawal, Jeff Clune, Kenneth O. Stanley

Abstract: The impressive lifelong learning in animal brains is primarily enabled by plastic changes in synaptic connectivity. Importantly, these changes are not passive, but are actively controlled by neuromodulation, which is itself under the control of the brain. The resulting self-modifying abilities of the brain play an important role in learning and adaptation, and are a major basis for biological reinforcement learning. Here we show for the first time that artificial neural networks with such neuromodulated plasticity can be trained with gradient descent. Extending previous work on differentiable Hebbian plasticity, we propose a differentiable formulation for the neuromodulation of plasticity. We show that neuromodulated plasticity improves the performance of neural networks on both reinforcement learning and supervised learning tasks. In one task, neuromodulated plastic LSTMs with millions of parameters outperform standard LSTMs on a benchmark language modeling task (controlling for the number of parameters). We conclude that differentiable neuromodulation of plasticity offers a powerful new framework for training neural networks.
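
A minimal sketch of one step of a neuromodulated plastic layer follows, assuming the commonly described form in which each connection has a fixed weight plus a plasticity coefficient times a Hebbian trace, and the trace update is scaled by a modulatory signal and clipped. The exact equations, the tanh nonlinearity, and the clipping range are assumptions here; in the paper's setting the modulatory signal is computed by the network itself, whereas this sketch passes it in as an argument for simplicity.

```python
import math


def clip(v, lo=-1.0, hi=1.0):
    return max(lo, min(hi, v))


def plastic_step(x_prev, w, alpha, hebb, modulation):
    """One forward step of a plastic layer.

    x_prev:     presynaptic activities, length n_in
    w, alpha:   fixed weights and plasticity coefficients, n_out x n_in
    hebb:       Hebbian traces, n_out x n_in
    modulation: scalar neuromodulatory signal for this step
    Returns (x_out, new_hebb).
    """
    n_in = len(x_prev)
    x_out = []
    for j in range(len(w)):
        # Effective weight = fixed part + plastic part.
        total = sum(
            (w[j][i] + alpha[j][i] * hebb[j][i]) * x_prev[i] for i in range(n_in)
        )
        x_out.append(math.tanh(total))
    # Modulated Hebbian update: pre * post, gated by the modulatory signal.
    new_hebb = [
        [clip(hebb[j][i] + modulation * x_prev[i] * x_out[j]) for i in range(n_in)]
        for j in range(len(w))
    ]
    return x_out, new_hebb


w = [[0.5, -0.5]]
alpha = [[1.0, 1.0]]
hebb = [[0.0, 0.0]]
x = [1.0, 0.5]

out, hebb_no_mod = plastic_step(x, w, alpha, hebb, modulation=0.0)
_, hebb_mod = plastic_step(x, w, alpha, hebb, modulation=1.0)
print(out, hebb_no_mod, hebb_mod)
```

With zero modulation the traces stay unchanged, so the signal acts as a switch on when the connection learns. Since every operation is differentiable almost everywhere, w and alpha could be trained by gradient descent in an autograd framework.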

* 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019

Estimating Q(s,s') with Deep Deterministic Dynamics Gradients

Feb 21, 2020

Ashley D. Edwards, Himanshu Sahni, Rosanne Liu, Jane Hung, Ankit Jain, Rui Wang, Adrien Ecoffet, Thomas Miconi, Charles Isbell, Jason Yosinski

Abstract: In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. This formulation decouples actions from values while still learning off-policy. We highlight the benefits of this approach in terms of value function transfer, learning within redundant action spaces, and learning off-policy from state observations generated by sub-optimal or completely random policies. Code and videos are available at sites.google.com/view/qss-paper.
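
To make the definition of $Q(s, s')$ concrete, here is a tabular sketch on a tiny deterministic chain. It assumes the recursion Q(s, s') = r(s, s') + gamma * max over s'' of Q(s', s''), which follows from the abstract's wording but is not quoted from it. The toy environment, reward, and function names are hypothetical. The paper learns a forward dynamics model to propose the next state; this sketch simply enumerates the neighbors instead.

```python
def qss_value_iteration(neighbors, reward, gamma=0.9, sweeps=100):
    """Tabular iteration over state pairs (s, s') instead of (s, a)."""
    q = {(s, s2): 0.0 for s, nexts in neighbors.items() for s2 in nexts}
    for _ in range(sweeps):
        for (s, s2) in q:
            # Value of the best onward transition from s'; 0 if s' is terminal.
            future = max((q[(s2, s3)] for s3 in neighbors[s2]), default=0.0)
            q[(s, s2)] = reward(s, s2) + gamma * future
    return q


# Chain 0 - 1 - 2, where state 2 is terminal and reaching it pays 1.
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: []}
reward = lambda s, s2: 1.0 if s2 == 2 else 0.0

q = qss_value_iteration(neighbors, reward)

# Greedy "policy" over next states: no action variable is involved.
best_next = {
    s: max(nexts, key=lambda s2: q[(s, s2)])
    for s, nexts in neighbors.items()
    if nexts
}
print(q, best_next)
```

Because the table is indexed by state pairs, it makes no reference to which action caused a transition, which is what allows learning from state-only observations.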

First-Order Preconditioning via Hypergradient Descent

Oct 18, 2019

Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

Abstract: Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a preconditioning matrix to the gradient to improve convergence. Unfortunately, such algorithms typically struggle to scale to high-dimensional problems, in part because the calculation of specific preconditioners such as the inverse Hessian or Fisher information matrix is highly expensive. We introduce first-order preconditioning (FOP), a fast, scalable approach that generalizes previous work on hypergradient descent (Almeida et al., 1998; Maclaurin et al., 2015; Baydin et al., 2017) to learn a preconditioning matrix that only makes use of first-order information. Experiments show that FOP is able to improve the performance of standard deep learning optimizers on several visual classification tasks with minimal computational overhead. We also investigate the properties of the learned preconditioning matrices and perform a preliminary theoretical analysis of the algorithm.
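
The first-order nature of the approach can be illustrated with a short derivation check. For a preconditioned step theta_new = theta - eta * P * g(theta), the chain rule gives the gradient of the new loss with respect to P as -eta * g(theta_new) * g(theta)^T, an outer product of two ordinary gradients. The sketch below verifies this against finite differences on a hypothetical ill-conditioned quadratic. It is a simplified illustration under these assumptions, not the paper's algorithm, and any parametrization the paper applies to P is omitted.

```python
def loss(theta):
    # Hypothetical ill-conditioned quadratic: curvature 10 vs 0.1.
    return 0.5 * (10.0 * theta[0] ** 2 + 0.1 * theta[1] ** 2)


def grad(theta):
    return [10.0 * theta[0], 0.1 * theta[1]]


def precond_step(theta, P, eta):
    """theta_new = theta - eta * P @ grad(theta)."""
    g = grad(theta)
    n = len(theta)
    return [theta[i] - eta * sum(P[i][j] * g[j] for j in range(n)) for i in range(n)]


def hypergradient(theta, P, eta):
    """d loss(theta_new) / dP = -eta * outer(g_new, g_old)."""
    g_old = grad(theta)
    g_new = grad(precond_step(theta, P, eta))
    n = len(theta)
    return [[-eta * g_new[i] * g_old[j] for j in range(n)] for i in range(n)]


def numerical_hypergradient(theta, P, eta, eps=1e-6):
    """Finite-difference check of the same quantity, entry by entry."""
    n = len(theta)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            Pp = [row[:] for row in P]
            Pm = [row[:] for row in P]
            Pp[i][j] += eps
            Pm[i][j] -= eps
            out[i][j] = (
                loss(precond_step(theta, Pp, eta)) - loss(precond_step(theta, Pm, eta))
            ) / (2 * eps)
    return out


theta = [1.0, -2.0]
P = [[1.0, 0.0], [0.0, 1.0]]
eta = 0.05

analytic = hypergradient(theta, P, eta)
numeric = numerical_hypergradient(theta, P, eta)
print(analytic)
print(numeric)
```

Descending this hypergradient adjusts P using only quantities that a standard optimizer already computes, which is why no Hessian or Fisher matrix is needed.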
