Nicolas Zucchet

Recurrent neural networks: vanishing and exploding gradients are not the end of the story

May 31, 2024
Via arXiv

Uncovering mesa-optimization algorithms in Transformers

Sep 11, 2023
Via arXiv

Gated recurrent neural networks discover attention

Sep 04, 2023
Via arXiv

Online learning of long-range dependencies

May 25, 2023
Via arXiv

Random initialisations performing above chance and how to find them

Sep 15, 2022
Via arXiv

The least-control principle for learning at equilibrium

Jul 04, 2022
Via arXiv

Beyond backpropagation: implicit gradients for bilevel optimization

May 06, 2022
Via arXiv

Learning where to learn: Gradient sparsity in meta and continual learning

Oct 27, 2021
Via arXiv

A contrastive rule for meta-learning

Apr 19, 2021
Via arXiv