Abstract: Progress in hybrid physics-machine learning (ML) climate simulations has been limited by the difficulty of obtaining performant coupled (i.e., online) simulations. While evaluating hundreds of ML parameterizations of subgrid closures (here of convection and radiation) offline is straightforward, online evaluation at the same scale is technically challenging. Our software automation achieves an order-of-magnitude larger sampling of online modeling errors than has previously been examined. Using this capability, we evaluate hybrid climate model performance and define strategies to improve it. We show that online model performance improves when incorporating memory, a relative humidity input feature transformation, and additional input variables. We also reveal substantial variation in online error and inconsistencies between offline and online error statistics. The implication is that hundreds of candidate ML models should be evaluated online to detect the effects of parameterization design choices. This is considerably more sampling than tends to be reported in the current literature.
Abstract: Global hydrological and land surface models are increasingly used for tracking terrestrial total water storage (TWS) dynamics, but the utility of existing models is hampered by conceptual and/or data uncertainties related to various underrepresented and unrepresented processes, such as groundwater storage. The Gravity Recovery and Climate Experiment (GRACE) satellite mission provided a valuable independent data source for tracking TWS at regional and continental scales. Strong interest exists in fusing GRACE data into global hydrological models to improve their predictive performance. Here we develop and apply deep convolutional neural network (CNN) models to learn the spatiotemporal patterns of mismatch between TWS anomalies (TWSA) derived from GRACE and those simulated by NOAH, a widely used land surface model. Once trained, our CNN models can be used to correct the NOAH-simulated TWSA without requiring GRACE data, potentially filling the data gap between GRACE and its follow-on mission, GRACE-FO. Our methodology is demonstrated over India, which has experienced significant groundwater depletion in recent decades that is not captured by the NOAH model. Results show that the CNN models significantly improve the match with GRACE TWSA, achieving a country-average correlation coefficient of 0.94 and a Nash-Sutcliffe efficiency of 0.87, improvements of 14\% and 52\%, respectively, over the original NOAH TWSA. At the local scale, the learned mismatch pattern correlates well with observed in situ groundwater storage anomaly data for most parts of India, suggesting that deep learning models effectively compensate for the missing groundwater component in NOAH for this study region.
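
The two skill metrics quoted above, the correlation coefficient and the Nash-Sutcliffe efficiency (NSE), can be illustrated with a minimal sketch. It uses the standard textbook definitions of both metrics and assumes, purely for illustration, that the learned mismatch is defined as GRACE TWSA minus NOAH TWSA, so that a correction is applied by addition; the abstract does not state the sign convention. All function names are hypothetical and do not come from the authors' code.

```python
from math import sqrt


def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    A value of 1 is a perfect match; 0 means the simulation is no better
    than predicting the observed mean.
    """
    if len(observed) != len(simulated):
        raise ValueError("series must have the same length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / variance


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def apply_mismatch_correction(simulated, predicted_mismatch):
    """Hypothetical helper: add a predicted mismatch to simulated TWSA.

    Assumes mismatch = observed - simulated (an illustrative convention,
    not confirmed by the abstract).
    """
    return [s + m for s, m in zip(simulated, predicted_mismatch)]
```

With these definitions, a simulation that reproduces the observations exactly has an NSE of 1, while one that always returns the observed mean has an NSE of 0, which is why NSE is a stricter measure of agreement than correlation alone.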