Abstract:When performing predictions that use Machine Learning (ML), we are mainly interested in performance and interpretability. This generates a natural trade-off, where complex models generally have higher skills but are harder to explain and thus trust. Interpretability is particularly important in the climate community, where we aim at gaining a physical understanding of the underlying phenomena. Even more so when the prediction concerns extreme weather events with high impact on society. In this paper, we perform probabilistic forecasts of extreme heatwaves over France, using a hierarchy of increasingly complex ML models, which allows us to find the best compromise between accuracy and interpretability. More precisely, we use models that range from a global Gaussian Approximation (GA) to deep Convolutional Neural Networks (CNNs), with the intermediate steps of a simple Intrinsically Interpretable Neural Network (IINN) and a model using the Scattering Transform (ScatNet). Our findings reveal that CNNs provide higher accuracy, but their black-box nature severely limits interpretability, even when using state-of-the-art Explainable Artificial Intelligence (XAI) tools. In contrast, ScatNet achieves similar performance to CNNs while providing greater transparency, identifying key scales and patterns in the data that drive predictions. This study underscores the potential of interpretability in ML models for climate science, demonstrating that simpler models can rival the performance of their more complex counterparts, all the while being much easier to understand. This gained interpretability is crucial for building trust in model predictions and uncovering new scientific insights, ultimately advancing our understanding and management of extreme weather events.
Abstract:We present a data-driven emulator, stochastic weather generator (SWG), suitable for estimating probabilities of prolonged heatwaves in France and Scandinavia. This emulator is based on the method of analogs of circulation to which we add temperature and soil moisture as predictor fields. We train the emulator on an intermediate complexity climate model run and show that it is capable of predicting conditional probabilities (forecasting) of heatwaves out of sample. Special attention is payed that this prediction is evaluated using proper score appropriate for rare events. To accelerate the computation of analogs dimensionality reduction techniques are applied and the performance is evaluated. The probabilistic prediction achieved with SWG is compared with the one achieved with Convolutional Neural Network (CNN). With the availability of hundreds of years of training data CNNs perform better at the task of probabilistic prediction. In addition, we show that the SWG emulator trained on 80 years of data is capable of estimating extreme return times of order of thousands of years for heatwaves longer than several days more precisely than the fit based on generalised extreme value distribution. Finally, the quality of its synthetic extreme teleconnection patterns obtained with stochastic weather generator is studied. We showcase two examples of such synthetic teleconnection patterns for heatwaves in France and Scandinavia that compare favorably to the very long climate model control run.
Abstract:Understanding extreme events and their probability is key for the study of climate change impacts, risk assessment, adaptation, and the protection of living beings. In this work we develop a methodology to build forecasting models for extreme heatwaves. These models are based on convolutional neural networks, trained on extremely long 8,000-year climate model outputs. Because the relation between extreme events is intrinsically probabilistic, we emphasise probabilistic forecast and validation. We demonstrate that deep neural networks are suitable for this purpose for long lasting 14-day heatwaves over France, up to 15 days ahead of time for fast dynamical drivers (500 hPa geopotential height fields), and also at much longer lead times for slow physical drivers (soil moisture). The method is easily implemented and versatile. We find that the deep neural network selects extreme heatwaves associated with a North-Hemisphere wavenumber-3 pattern. We find that the 2 meter temperature field does not contain any new useful statistical information for heatwave forecast, when added to the 500 hPa geopotential height and soil moisture fields. The main scientific message is that training deep neural networks for predicting extreme heatwaves occurs in a regime of drastic lack of data. We suggest that this is likely the case for most other applications to large scale atmosphere and climate phenomena. We discuss perspectives for dealing with the lack of data regime, for instance rare event simulations, and how transfer learning may play a role in this latter task.
Abstract:Forecasting the occurrence of heatwaves constitutes a challenging issue, yet of major societal stake, because extreme events are not often observed and (very) costly to simulate from physics-driven numerical models. The present work aims to explore the use of Deep Learning architectures as alternative strategies to predict extreme heatwaves occurrences from a very limited amount of available relevant climate data. This implies addressing issues such as the aggregation of climate data of different natures, the class-size imbalance that is intrinsically associated with rare event prediction, and the potential benefits of transfer learning to address the nested nature of extreme events (naturally included in less extreme ones). Using 1000 years of state-of-the-art PlaSim Planete Simulator Climate Model data, it is shown that Convolutional Neural Network-based Deep Learning frameworks, with large-class undersampling and transfer learning achieve significant performance in forecasting the occurrence of extreme heatwaves, at three different levels of intensity, and as early as 15 days in advance from the restricted observation, for a single time (single snapshoot) of only two spatial fields of climate data, surface temperature and geopotential height.