Abstract: A common challenge in continual learning (CL) is catastrophic forgetting, where performance on old tasks drops after new tasks are learned. In this paper, we propose a novel framework called ReCL to slow down forgetting in CL. Our framework exploits an implicit bias of neural networks trained with gradient-based methods, which causes them to converge to margin-maximization points. Such convergence points allow us to reconstruct old data from previous tasks, which we then combine with the current training data. Our framework is flexible and can be applied on top of existing, state-of-the-art CL methods to slow down forgetting. We further demonstrate the performance gain from our framework across a large series of experiments, including different CL scenarios (class-incremental, domain-incremental, and task-incremental learning), different datasets (MNIST, CIFAR10), and different network architectures. Across all experiments, we find large performance gains through ReCL. To the best of our knowledge, our framework is the first to address catastrophic forgetting by leveraging models in CL as their own memory buffers.
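As a rough illustration of the reconstruction idea, the sketch below assumes the formulation commonly used for reconstruction from margin-maximization points: at convergence, the parameters approximately satisfy a stationarity condition theta = sum_i lam_i * y_i * grad_theta f(theta; x_i), and candidate inputs are optimized to shrink the residual of that condition. The abstract does not spell out ReCL's objective, so the linear model, the fixed multipliers `lams`, and all function names here are hypothetical choices for illustration, not the authors' implementation.

```python
# Toy sketch (assumption: linear model f(theta; x) = theta . x, so grad_theta f = x).
# Stationarity condition assumed here: theta = sum_i lam_i * y_i * x_i.

def weighted_sum(xs, ys, lams, dim):
    """Return sum_i lam_i * y_i * x_i as a list of length dim."""
    return [sum(l * y * x[d] for x, y, l in zip(xs, ys, lams)) for d in range(dim)]


def stationarity_loss(theta, xs, ys, lams):
    """Squared residual || theta - sum_i lam_i * y_i * x_i ||^2."""
    combo = weighted_sum(xs, ys, lams, len(theta))
    return sum((t - c) ** 2 for t, c in zip(theta, combo))


def reconstruct(theta, xs, ys, lams, lr=0.05, steps=200):
    """Gradient descent on the candidate inputs xs; labels and multipliers stay fixed."""
    xs = [list(x) for x in xs]  # copy so the caller's candidates are not modified
    dim = len(theta)
    for _ in range(steps):
        combo = weighted_sum(xs, ys, lams, dim)
        resid = [t - c for t, c in zip(theta, combo)]
        for x, y, l in zip(xs, ys, lams):
            for d in range(dim):
                # d(loss)/d(x[d]) = -2 * lam * y * resid[d]; step against the gradient
                x[d] += lr * 2 * l * y * resid[d]
    return xs


# Hypothetical "trained" parameters and candidate points (made-up numbers).
theta = [2.0, 0.0]
ys = [1, -1]
lams = [1.0, 1.0]
init_xs = [[0.3, 0.4], [0.1, -0.2]]

recovered = reconstruct(theta, init_xs, ys, lams)
loss_before = stationarity_loss(theta, init_xs, ys, lams)
loss_after = stationarity_loss(theta, recovered, ys, lams)
```

Minimizers of this residual are generally not unique, so the recovered points need not coincide with the original training data. In a continual learning loop, the reconstructed samples would then be combined with the current task's training data, as the abstract describes.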
Abstract: Accurate vegetation models can produce further insights into the complex interaction between vegetation activity and ecosystem processes. Previous research has established that long-term trends and short-term variability of temperature and precipitation affect vegetation activity. Motivated by the recent success of Transformer-based deep learning models for medium-range weather forecasting, we adapt the publicly available pre-trained FourCastNet to model vegetation activity while accounting for the short-term dynamics of climate variability. We investigate how the learned global representation of the atmosphere's state can be transferred to model the normalized difference vegetation index (NDVI). Our model globally estimates vegetation activity at a resolution of 0.25° while relying only on meteorological data. We demonstrate that leveraging pre-trained weather models improves the NDVI estimates compared to learning an NDVI model from scratch. Additionally, we compare our results to other recent data-driven NDVI modeling approaches from the machine learning and ecology literature. We further provide experimental evidence on how much data and training time are necessary to turn FourCastNet into an effective vegetation model. Code and models will be made available upon publication.
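For readers unfamiliar with the target quantity, NDVI is conventionally defined from near-infrared and red surface reflectance as (NIR - Red) / (NIR + Red). The minimal sketch below only illustrates that standard definition; the function name and the NaN handling for a zero denominator are assumptions made here, and the model described in the abstract estimates NDVI from meteorological data rather than computing it from reflectance.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    denom = nir + red
    if denom == 0:
        # Assumed convention for this sketch: undefined pixels are reported as NaN.
        return float("nan")
    return (nir - red) / denom


# Made-up reflectance values: dense vegetation reflects strongly in NIR and weakly in red.
vegetated = ndvi(0.5, 0.1)
bare = ndvi(0.2, 0.2)
```

Values lie in [-1, 1] for non-negative reflectances, with higher values indicating more vegetation activity.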