Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gido M. van de Ven

Baylor College of Medicine

On the Computation of the Fisher Information in Continual Learning

Feb 17, 2025

Gido M. van de Ven

Abstract:One of the most popular methods for continual learning with deep neural networks is Elastic Weight Consolidation (EWC), which involves computing the Fisher Information. The exact way in which the Fisher Information is computed is however rarely described, and multiple different implementations for it can be found online. This blog post discusses and empirically compares several often-used implementations, which highlights that many currently reported results for EWC could likely be improved by changing the way the Fisher Information is computed.

* To appear in the blogpost track at ICLR 2025

Via

Access Paper or Ask Questions

Aligning Generalisation Between Humans and Machines

Nov 23, 2024

Filip Ilievski, Barbara Hammer, Frank van Harmelen, Benjamin Paassen, Sascha Saralajew, Ute Schmid, Michael Biehl, Marianna Bolognesi, Xin Luna Dong, Kiril Gashteovski(+15 more)

Figure 1 for Aligning Generalisation Between Humans and Machines

Figure 2 for Aligning Generalisation Between Humans and Machines

Figure 3 for Aligning Generalisation Between Humans and Machines

Figure 4 for Aligning Generalisation Between Humans and Machines

Abstract:Recent advances in AI -- including generative approaches -- have resulted in technology that can support humans in scientific discovery and decision support but may also disrupt democracies and target individuals. The responsible use of AI increasingly shows the need for human-AI teaming, necessitating effective interaction between humans and machines. A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalise. In cognitive science, human generalisation commonly involves abstraction and concept learning. In contrast, AI generalisation encompasses out-of-domain generalisation in machine learning, rule-based reasoning in symbolic AI, and abstraction in neuro-symbolic AI. In this perspective paper, we combine insights from AI and cognitive science to identify key commonalities and differences across three dimensions: notions of generalisation, methods for generalisation, and evaluation of generalisation. We map the different conceptualisations of generalisation in AI and cognitive science along these three dimensions and consider their role in human-AI teaming. This results in interdisciplinary challenges across AI and cognitive science that must be tackled to provide a foundation for effective and cognitively supported alignment in human-AI teaming scenarios.

Via

Access Paper or Ask Questions

Continual Learning and Catastrophic Forgetting

Mar 08, 2024

Gido M. van de Ven, Nicholas Soures, Dhireesha Kudithipudi

Figure 1 for Continual Learning and Catastrophic Forgetting

Figure 2 for Continual Learning and Catastrophic Forgetting

Figure 3 for Continual Learning and Catastrophic Forgetting

Abstract:This book chapter delves into the dynamics of continual learning, which is the process of incrementally learning from a non-stationary stream of data. Although continual learning is a natural skill for the human brain, it is very challenging for artificial neural networks. An important reason is that, when learning something new, these networks tend to quickly and drastically forget what they had learned before, a phenomenon known as catastrophic forgetting. Especially in the last decade, continual learning has become an extensively studied topic in deep learning. This book chapter reviews the insights that this field has generated.

* Preprint of a book chapter; 21 pages, 4 figures

Via

Access Paper or Ask Questions

Disentangled Continual Learning: Separating Memory Edits from Model Updates

Dec 27, 2023

Sebastian Dziadzio, Çağatay Yıldız, Gido M. van de Ven, Tomasz Trzciński, Tinne Tuytelaars, Matthias Bethge

Figure 1 for Disentangled Continual Learning: Separating Memory Edits from Model Updates

Figure 2 for Disentangled Continual Learning: Separating Memory Edits from Model Updates

Figure 3 for Disentangled Continual Learning: Separating Memory Edits from Model Updates

Figure 4 for Disentangled Continual Learning: Separating Memory Edits from Model Updates

Abstract:The ability of machine learning systems to learn continually is hindered by catastrophic forgetting, the tendency of neural networks to overwrite existing knowledge when learning a new task. Existing continual learning methods alleviate this problem through regularisation, parameter isolation, or rehearsal, and are typically evaluated on benchmarks consisting of a handful of tasks. We propose a novel conceptual approach to continual classification that aims to disentangle class-specific information that needs to be memorised from the class-agnostic knowledge that encapsulates generalization. We store the former in a buffer that can be easily pruned or updated when new categories arrive, while the latter is represented with a neural network that generalizes across tasks. We show that the class-agnostic network does not suffer from catastrophic forgetting and by leveraging it to perform classification, we improve accuracy on past tasks over time. In addition, our approach supports open-set classification and one-shot generalization. To test our conceptual framework, we introduce Infinite dSprites, a tool for creating continual classification and disentanglement benchmarks of arbitrary length with full control over generative factors. We show that over a sufficiently long time horizon all major types of continual learning methods break down, while our approach enables continual learning over hundreds of tasks with explicit control over memorization and forgetting.

* 12 pages, 11 figures

Via

Access Paper or Ask Questions

Continual Learning of Diffusion Models with Generative Distillation

Nov 23, 2023

Sergi Masip, Pau Rodriguez, Tinne Tuytelaars, Gido M. van de Ven

Abstract:Diffusion models are powerful generative models that achieve state-of-the-art performance in tasks such as image synthesis. However, training them demands substantial amounts of data and computational resources. Continual learning would allow for incrementally learning new tasks and accumulating knowledge, thus reusing already trained models would be possible. One potentially suitable approach is generative replay, where a copy of a generative model trained on previous tasks produces synthetic data that are interleaved with data from the current task. However, standard generative replay applied to diffusion models results in a catastrophic loss in denoising capabilities. In this paper, we propose generative distillation, an approach that distils the entire reverse process of a diffusion model. We demonstrate that our approach significantly improves the continual learning performance of generative replay with only a moderate increase in the computational costs.

Via

Access Paper or Ask Questions

Continual Learning: Applications and the Road Forward

Nov 21, 2023

Eli Verwimp, Rahaf Aljundi, Shai Ben-David, Matthias Bethge, Andrea Cossu, Alexander Gepperth, Tyler L. Hayes, Eyke Hüllermeier, Christopher Kanan, Dhireesha Kudithipudi(+10 more)

Figure 1 for Continual Learning: Applications and the Road Forward

Figure 2 for Continual Learning: Applications and the Road Forward

Figure 3 for Continual Learning: Applications and the Road Forward

Abstract:Continual learning is a sub-field of machine learning, which aims to allow machine learning models to continuously learn on new data, by accumulating knowledge without forgetting what was learned in the past. In this work, we take a step back, and ask: "Why should one care about continual learning in the first place?". We set the stage by surveying recent continual learning papers published at three major machine learning conferences, and show that memory-constrained settings dominate the field. Then, we discuss five open problems in machine learning, and even though they seem unrelated to continual learning at first sight, we show that continual learning will inevitably be part of their solution. These problems are model-editing, personalization, on-device learning, faster (re-)training and reinforcement learning. Finally, by comparing the desiderata from these unsolved problems and the current assumptions in continual learning, we highlight and discuss four future directions for continual learning research. We hope that this work offers an interesting perspective on the future of continual learning, while displaying its potential value and the paths we have to pursue in order to make it successful. This work is the result of the many discussions the authors had at the Dagstuhl seminar on Deep Continual Learning, in March 2023.

Via

Access Paper or Ask Questions

Two Complementary Perspectives to Continual Learning: Ask Not Only What to Optimize, But Also How

Nov 08, 2023

Timm Hess, Tinne Tuytelaars, Gido M. van de Ven

Abstract:Recent years have seen considerable progress in the continual training of deep neural networks, predominantly thanks to approaches that add replay or regularization terms to the loss function to approximate the joint loss over all tasks so far. However, we show that even with a perfect approximation to the joint loss, these approaches still suffer from temporary but substantial forgetting when starting to train on a new task. Motivated by this 'stability gap', we propose that continual learning strategies should focus not only on the optimization objective, but also on the way this objective is optimized. While there is some continual learning work that alters the optimization trajectory (e.g., using gradient projection techniques), this line of research is positioned as alternative to improving the optimization objective, while we argue it should be complementary. To evaluate the merits of our proposition, we plan to combine replay-approximated joint objectives with gradient projection-based optimization routines to test whether the addition of the latter provides benefits in terms of (1) alleviating the stability gap, (2) increasing the learning efficiency and (3) improving the final learning outcome.

* Pre-registered report, accepted at the 1st ContinualAI Unconference. Full paper with the results of the proposed experiment is expected to follow by June 2024

Via

Access Paper or Ask Questions

Prediction Error-based Classification for Class-Incremental Learning

May 30, 2023

Michał Zając, Tinne Tuytelaars, Gido M. van de Ven

Abstract:Class-incremental learning (CIL) is a particularly challenging variant of continual learning, where the goal is to learn to discriminate between all classes presented in an incremental fashion. Existing approaches often suffer from excessive forgetting and imbalance of the scores assigned to classes that have not been seen together during training. In this study, we introduce a novel approach, Prediction Error-based Classification (PEC), which differs from traditional discriminative and generative classification paradigms. PEC computes a class score by measuring the prediction error of a model trained to replicate the outputs of a frozen random neural network on data from that class. The method can be interpreted as approximating a classification rule based on Gaussian Process posterior variance. PEC offers several practical advantages, including sample efficiency, ease of tuning, and effectiveness even when data are presented one class at a time. Our empirical results show that PEC performs strongly in single-pass-through-data CIL, outperforming other rehearsal-free baselines in all cases and rehearsal-based methods with moderate replay buffer size in most cases across multiple benchmarks.

Via

Access Paper or Ask Questions

Knowledge Accumulation in Continually Learned Representations and the Issue of Feature Forgetting

Apr 03, 2023

Timm Hess, Eli Verwimp, Gido M. van de Ven, Tinne Tuytelaars

Figure 1 for Knowledge Accumulation in Continually Learned Representations and the Issue of Feature Forgetting

Figure 2 for Knowledge Accumulation in Continually Learned Representations and the Issue of Feature Forgetting

Figure 3 for Knowledge Accumulation in Continually Learned Representations and the Issue of Feature Forgetting

Figure 4 for Knowledge Accumulation in Continually Learned Representations and the Issue of Feature Forgetting

Abstract:By default, neural networks learn on all training data at once. When such a model is trained on sequential chunks of new data, it tends to catastrophically forget how to handle old data. In this work we investigate how continual learners learn and forget representations. We observe two phenomena: knowledge accumulation, i.e. the improvement of a representation over time, and feature forgetting, i.e. the loss of task-specific representations. To better understand both phenomena, we introduce a new analysis technique called task exclusion comparison. If a model has seen a task and it has not forgotten all the task-specific features, then its representation for that task should be better than that of a model that was trained on similar tasks, but not that exact one. Our image classification experiments show that most task-specific features are quickly forgotten, in contrast to what has been suggested in the past. Further, we demonstrate how some continual learning methods, like replay, and ideas from representation learning affect a continually learned representation. We conclude by observing that representation quality is tightly correlated with continual learning performance.

Via

Access Paper or Ask Questions

Class-Incremental Learning with Generative Classifiers

Apr 28, 2021

Gido M. van de Ven, Zhe Li, Andreas S. Tolias

Figure 1 for Class-Incremental Learning with Generative Classifiers

Figure 2 for Class-Incremental Learning with Generative Classifiers

Figure 3 for Class-Incremental Learning with Generative Classifiers

Figure 4 for Class-Incremental Learning with Generative Classifiers

Abstract:Incrementally training deep neural networks to recognize new classes is a challenging problem. Most existing class-incremental learning methods store data or use generative replay, both of which have drawbacks, while 'rehearsal-free' alternatives such as parameter regularization or bias-correction methods do not consistently achieve high performance. Here, we put forward a new strategy for class-incremental learning: generative classification. Rather than directly learning the conditional distribution p(y|x), our proposal is to learn the joint distribution p(x,y), factorized as p(x|y)p(y), and to perform classification using Bayes' rule. As a proof-of-principle, here we implement this strategy by training a variational autoencoder for each class to be learned and by using importance sampling to estimate the likelihoods p(x|y). This simple approach performs very well on a diverse set of continual learning benchmarks, outperforming generative replay and other existing baselines that do not store data.

* To appear in the IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPR-W) on Continual Learning in Computer Vision (CLVision) 2021

Via

Access Paper or Ask Questions