Abstract: The forecasting skill of numerical weather prediction (NWP) models depends critically on accurate initial conditions, also known as the analysis, provided by data assimilation (DA). Traditional DA methods often face a trade-off between computational cost and accuracy due to complex linear algebra computations and the high dimensionality of the model, especially in nonlinear systems. Moreover, processing massive data in real time requires substantial computational resources. To address this, we introduce an artificial intelligence-based data assimilation framework (ADAF) to generate high-quality kilometer-scale analysis. This study is pioneering work in using real-world observations from varied locations and multiple sources, including sparse surface weather observations and satellite imagery, to verify the AI method's efficacy in DA. We implemented ADAF for four near-surface variables over the Contiguous United States (CONUS). The results indicate that ADAF surpasses the High Resolution Rapid Refresh Data Assimilation System (HRRRDAS) in accuracy by 16% to 33% for near-surface atmospheric conditions, aligns more closely with actual observations, and can effectively reconstruct extreme events such as tropical cyclone wind fields. Sensitivity experiments reveal that ADAF can generate high-quality analysis even with low-accuracy backgrounds and extremely sparse surface observations. ADAF can assimilate massive observations within a three-hour window at low computational cost, taking about two seconds on an AMD MI200 graphics processing unit (GPU). These results show that ADAF is efficient and effective in real-world DA, underscoring its potential role in operational weather forecasting.
Abstract: In recent years, AI-based weather forecasting models have matched or even outperformed numerical weather prediction systems. However, most of these models have been trained and evaluated on reanalysis datasets such as ERA5. These datasets, being products of numerical models, often diverge substantially from actual observations in crucial variables such as near-surface temperature, wind, precipitation, and clouds, all of which hold significant public interest. To address this divergence, we introduce WeatherReal, a novel benchmark dataset for weather forecasting derived from global near-surface in-situ observations. WeatherReal also features a publicly accessible quality control and evaluation framework. This paper details the sources and processing methodologies underlying the dataset, and illustrates the advantage of in-situ observations in capturing hyper-local and extreme weather through comparative analyses and case studies. Using WeatherReal, we evaluated several data-driven models and compared them with leading numerical models. Our work aims to advance AI-based weather forecasting research towards a more application-focused and operation-ready approach.
Abstract: Accurate and timely rain prediction is crucial for decision making and is also a challenging task. This paper presents a solution that won the 2nd prize in the Weather4cast 2022 NeurIPS competition, using 3D U-Nets and EarthFormers for 8-hour probabilistic rain prediction based on multi-band satellite images. The effect of the spatial context of the input satellite image was explored in depth, and an optimal context range was identified. Because the rain distribution is imbalanced, we trained multiple models with different loss functions. To further improve model performance, multi-model ensembling and threshold optimization were used to produce the final probabilistic rain prediction. Experimental results and leaderboard scores demonstrate that optimal spatial context, a combined loss function, multi-model ensembling, and threshold optimization each provide a modest gain. A permutation test was used to analyze the effect of each satellite band on rain prediction, and the results show that the satellite bands signifying cloud-top phase (8.7 µm) and cloud-top height (10.8 and 13.4 µm) are the best predictors of rain. The source code is available at https://github.com/bugsuse/weather4cast-2022-stage2.
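The abstract does not spell out how the permutation test was carried out, so the following is only a minimal sketch of the standard permutation-importance idea, not the authors' implementation. It assumes inputs shaped (samples, bands, height, width), shuffles one band at a time across samples, and records how much a score drops. The names `permutation_band_importance`, `toy_predict`, and `accuracy` are hypothetical, the toy model stands in for a trained network, and plain accuracy stands in for whatever metric the competition used.

```python
import numpy as np


def permutation_band_importance(predict, x, y, metric, n_repeats=5, seed=0):
    """Mean drop in `metric` when each band is shuffled across samples.

    x is assumed to have shape (n_samples, n_bands, height, width).
    A larger drop means the model relies more heavily on that band.
    """
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(x))
    drops = np.zeros(x.shape[1])
    for band in range(x.shape[1]):
        for _ in range(n_repeats):
            x_perm = x.copy()
            # Break the link between this band and the target by
            # reassigning the band's images to random samples.
            order = rng.permutation(x.shape[0])
            x_perm[:, band] = x[order, band]
            drops[band] += baseline - metric(y, predict(x_perm))
    return drops / n_repeats


def toy_predict(x):
    # Hypothetical stand-in model: predicts "rain" from band 1 only.
    return (x[:, 1].mean(axis=(1, 2)) > 0).astype(int)


def accuracy(y_true, y_pred):
    # Stand-in metric; the competition metric may differ.
    return float(np.mean(y_true == y_pred))


rng = np.random.default_rng(42)
x_demo = rng.normal(size=(200, 3, 4, 4))  # synthetic 3-band "images"
y_demo = toy_predict(x_demo)              # labels depend on band 1 only
importance = permutation_band_importance(toy_predict, x_demo, y_demo, accuracy)
print(importance)
```

In this toy setup only band 1 carries signal, so shuffling it lowers the score while shuffling the other bands leaves it unchanged. With a real model, the ranking of the per-band drops plays the role of the band ranking reported in the abstract.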