Department of Mathematics and Computer Science, Free University of Berlin, Germany, Potsdam Institute for Climate Impact Research, Potsdam, Germany, Department of Mathematics and Global Systems Institute, University of Exeter, Exeter, UK
Abstract:Latent neural stochastic differential equations (SDEs) have recently emerged as a promising approach for learning generative models from stochastic time series data. However, they systematically underestimate the noise level inherent in such data, limiting their ability to capture stochastic dynamics accurately. We investigate this underestimation in detail and propose a straightforward solution: by including an explicit additional noise regularization in the loss function, we are able to learn a model that accurately captures the diffusion component of the data. We demonstrate our results on a conceptual model system that highlights the improved latent neural SDE's capability to model stochastic bistable dynamics.
Abstract:Neural differential equations offer a powerful approach for learning dynamics from data. However, they do not impose known constraints that should be obeyed by the learned model. It is well-known that enforcing constraints in surrogate models can enhance their generalizability and numerical stability. In this paper, we introduce projected neural differential equations (PNDEs), a new method for constraining neural differential equations based on projection of the learned vector field to the tangent space of the constraint manifold. In tests on several challenging examples, including chaotic dynamical systems and state-of-the-art power grid models, PNDEs outperform existing methods while requiring fewer hyperparameters. The proposed approach demonstrates significant potential for enhancing the modeling of constrained dynamical systems, particularly in complex domains where accuracy and reliability are essential.
Abstract:Nonlinear dynamical systems exposed to changing forcing can exhibit catastrophic transitions between alternative and often markedly different states. The phenomenon of critical slowing down (CSD) can be used to anticipate such transitions if caused by a bifurcation and if the change in forcing is slow compared to the internal time scale of the system. However, in many real-world situations, these assumptions are not met and transitions can be triggered because the forcing exceeds a critical rate. For example, given the pace of anthropogenic climate change in comparison to the internal time scales of key Earth system components, such as the polar ice sheets or the Atlantic Meridional Overturning Circulation, such rate-induced tipping poses a severe risk. Moreover, depending on the realisation of random perturbations, some trajectories may transition across an unstable boundary, while others do not, even under the same forcing. CSD-based indicators generally cannot distinguish these cases of noise-induced tipping versus no tipping. This severely limits our ability to assess the risks of tipping, and to predict individual trajectories. To address this, we make a first attempt to develop a deep learning framework to predict transition probabilities of dynamical systems ahead of rate-induced transitions. Our method issues early warnings, as demonstrated on three prototypical systems for rate-induced tipping, subjected to time-varying equilibrium drift and noise perturbations. Exploiting explainable artificial intelligence methods, our framework captures the fingerprints necessary for early detection of rate-induced tipping, even in cases of long lead times. Our findings demonstrate the predictability of rate-induced and noise-induced tipping, advancing our ability to determine safe operating spaces for a broader class of dynamical systems than possible so far.
Abstract:Predicting chaotic dynamical systems is critical in many scientific fields such as weather prediction, but challenging due to the characterizing sensitive dependence on initial conditions. Traditional modeling approaches require extensive domain knowledge, often leading to a shift towards data-driven methods using machine learning. However, existing research provides inconclusive results on which machine learning methods are best suited for predicting chaotic systems. In this paper, we compare different lightweight and heavyweight machine learning architectures using extensive existing databases, as well as a newly introduced one that allows for uncertainty quantification in the benchmark results. We perform hyperparameter tuning based on computational cost and introduce a novel error metric, the cumulative maximum error, which combines several desirable properties of traditional metrics, tailored for chaotic systems. Our results show that well-tuned simple methods, as well as untuned baseline methods, often outperform state-of-the-art deep learning models, but their performance can vary significantly with different experimental setups. These findings underscore the importance of matching prediction methods to data characteristics and available computational resources.
Abstract:Ocean dynamics plays a crucial role in driving global weather and climate patterns. Accurate and efficient modeling of ocean dynamics is essential for improved understanding of complex ocean circulation and processes, for predicting climate variations and their associated teleconnections, and for addressing the challenges of climate change. While great efforts have been made to improve numerical Ocean General Circulation Models (OGCMs), accurate forecasting of global oceanic variations for multi-year remains to be a long-standing challenge. Here, we introduce ORCA (Oceanic Reliable foreCAst), the first data-driven model predicting global ocean circulation from multi-year to decadal time scales. ORCA accurately simulates the three-dimensional circulations and dynamics of the global ocean with high physical consistency. Hindcasts of key oceanic variables demonstrate ORCA's remarkable prediction skills in predicting ocean variations compared with state-of-the-art numerical OGCMs and abilities in capturing occurrences of extreme events at the subsurface ocean and ENSO vertical patterns. These results demonstrate the potential of data-driven ocean models for providing cheap, efficient, and accurate global ocean modeling and prediction. Moreover, ORCA stably and faithfully emulates ocean dynamics at decadal timescales, demonstrating its potential even for climate projections. The model will be available at https://github.com/OpenEarthLab/ORCA.
Abstract:Accurate and high-resolution Earth system model (ESM) simulations are essential to assess the ecological and socio-economic impacts of anthropogenic climate change, but are computationally too expensive. Recent machine learning approaches have shown promising results in downscaling ESM simulations, outperforming state-of-the-art statistical approaches. However, existing methods require computationally costly retraining for each ESM and extrapolate poorly to climates unseen during training. We address these shortcomings by learning a consistency model (CM) that efficiently and accurately downscales arbitrary ESM simulations without retraining in a zero-shot manner. Our foundation model approach yields probabilistic downscaled fields at resolution only limited by the observational reference data. We show that the CM outperforms state-of-the-art diffusion models at a fraction of computational cost while maintaining high controllability on the downscaling task. Further, our method generalizes to climate states unseen during training without explicitly formulated physical constraints.
Abstract:Recent studies have shown that deep learning (DL) models can skillfully predict the El Ni\~no-Southern Oscillation (ENSO) forecasts over 1.5 years ahead. However, concerns regarding the reliability of predictions made by DL methods persist, including potential overfitting issues and lack of interpretability. Here, we propose ResoNet, a DL model that combines convolutional neural network (CNN) and Transformer architectures. This hybrid architecture design enables our model to adequately capture local SSTA as well as long-range inter-basin interactions across oceans. We show that ResoNet can robustly predict ESNO at lead times between 19 and 26 months, thus outperforming existing approaches in terms of the forecast horizon. According to an explainability method applied to ResoNet predictions of El Ni\~no and La Ni\~na events from 1- to 18-month lead, we find that it predicts the Ni\~no3.4 index based on multiple physically reasonable mechanisms, such as the Recharge Oscillator concept, Seasonal Footprint Mechanism, and Indian Ocean capacitor effect. Moreover, we demonstrate that for the first time, the asymmetry between El Ni\~no and La Ni\~na development can be captured by ResoNet. Our results could help alleviate skepticism about applying DL models for ENSO prediction and encourage more attempts to discover and predict climate phenomena using AI methods.
Abstract:Historical records of climate fields are often sparse due to missing measurements, especially before the introduction of large-scale satellite missions. Several statistical and model-based methods have been introduced to fill gaps and reconstruct historical records. Here, we employ a recently introduced deep-learning approach based on Fourier convolutions, trained on numerical climate model output, to reconstruct historical climate fields. Using this approach we are able to realistically reconstruct large and irregular areas of missing data, as well as reconstruct known historical events such as strong El Ni\~no and La Ni\~na with very little given information. Our method outperforms the widely used statistical kriging method as well as other recent machine learning approaches. The model generalizes to higher resolutions than the ones it was trained on and can be used on a variety of climate fields. Moreover, it allows inpainting of masks never seen before during the model training.
Abstract:Precipitation nowcasting (up to a few hours) remains a challenge due to the highly complex local interactions that need to be captured accurately. Convolutional Neural Networks rely on convolutional kernels convolving with grid data and the extracted features are trapped by limited receptive field, typically expressed in excessively smooth output compared to ground truth. Thus they lack the capacity to model complex spatial relationships among the grids. Geometric deep learning aims to generalize neural network models to non-Euclidean domains. Such models are more flexible in defining nodes and edges and can effectively capture dynamic spatial relationship among geographical grids. Motivated by this, we explore a geometric deep learning-based temporal Graph Convolutional Network (GCN) for precipitation nowcasting. The adjacency matrix that simulates the interactions among grid cells is learned automatically by minimizing the L1 loss between prediction and ground truth pixel value during the training procedure. Then, the spatial relationship is refined by GCN layers while the temporal information is extracted by 1D convolution with various kernel lengths. The neighboring information is fed as auxiliary input layers to improve the final result. We test the model on sequences of radar reflectivity maps over the Trento/Italy area. The results show that GCNs improves the effectiveness of modeling the local details of the cloud profile as well as the prediction accuracy by achieving decreased error measures.
Abstract:Many successful methods to learn dynamical systems from data have recently been introduced. However, assuring that the inferred dynamics preserve known constraints, such as conservation laws or restrictions on the allowed system states, remains challenging. We propose stabilized neural differential equations (SNDEs), a method to enforce arbitrary manifold constraints for neural differential equations. Our approach is based on a stabilization term that, when added to the original dynamics, renders the constraint manifold provably asymptotically stable. Due to its simplicity, our method is compatible with all common neural ordinary differential equation (NODE) models and broadly applicable. In extensive empirical evaluations, we demonstrate that SNDEs outperform existing methods while extending the scope of which types of constraints can be incorporated into NODE training.