Abstract:In our contemporary era, meteorological weather forecasts increasingly incorporate ensemble predictions of visibility - a parameter of great importance in aviation, maritime navigation, and air quality assessment, with direct implications for public health. However, this weather variable falls short of the predictive accuracy achieved for other quantities issued by meteorological centers. Therefore, statistical post-processing is recommended to enhance the reliability and accuracy of predictions. By estimating the predictive distributions of the variables with the aid of historical observations and forecasts, one can achieve statistical consistency between true observations and ensemble predictions. Visibility observations, following the recommendation of the World Meteorological Organization, are typically reported in discrete values; hence, the predictive distribution of the weather quantity takes the form of a discrete parametric law. Recent studies demonstrated that the application of classification algorithms can successfully improve the skill of such discrete forecasts; however, a frequently emerging issue is that certain spatial and/or temporal dependencies could be lost between marginals. Based on visibility ensemble forecasts of the European Centre for Medium-Range Weather Forecasts for 30 locations in Central Europe, we investigate whether the inclusion of Copernicus Atmosphere Monitoring Service (CAMS) predictions of the same weather quantity as an additional covariate could enhance the skill of the post-processing methods and whether it contributes to the successful integration of spatial dependence between marginals. Our study confirms that post-processed forecasts are substantially superior to raw and climatological predictions, and the utilization of CAMS forecasts provides a further significant enhancement both in the univariate and multivariate setup.
Abstract:To be able to produce accurate and reliable predictions of visibility has crucial importance in aviation meteorology, as well as in water- and road transportation. Nowadays, several meteorological services provide ensemble forecasts of visibility; however, the skill, and reliability of visibility predictions are far reduced compared to other variables, such as temperature or wind speed. Hence, some form of calibration is strongly advised, which usually means estimation of the predictive distribution of the weather quantity at hand either by parametric or non-parametric approaches, including also machine learning-based techniques. As visibility observations - according to the suggestion of the World Meteorological Organization - are usually reported in discrete values, the predictive distribution for this particular variable is a discrete probability law, hence calibration can be reduced to a classification problem. Based on visibility ensemble forecasts of the European Centre for Medium-Range Weather Forecasts covering two slightly overlapping domains in Central and Western Europe and two different time periods, we investigate the predictive performance of locally, semi-locally and regionally trained proportional odds logistic regression (POLR) and multilayer perceptron (MLP) neural network classifiers. We show that while climatological forecasts outperform the raw ensemble by a wide margin, post-processing results in further substantial improvement in forecast skill and in general, POLR models are superior to their MLP counterparts.
Abstract:By the end of 2021, the renewable energy share of the global electricity capacity reached 38.3% and the new installations are dominated by wind and solar energy, showing global increases of 12.7% and 18.5%, respectively. However, both wind and photovoltaic energy sources are highly volatile making planning difficult for grid operators, so accurate forecasts of the corresponding weather variables are essential for reliable electricity predictions. The most advanced approach in weather prediction is the ensemble method, which opens the door for probabilistic forecasting; though ensemble forecast are often underdispersive and subject to systematic bias. Hence, they require some form of statistical post-processing, where parametric models provide full predictive distributions of the weather variables at hand. We propose a general two-step machine learning-based approach to calibrating ensemble weather forecasts, where in the first step improved point forecasts are generated, which are then together with various ensemble statistics serve as input features of the neural network estimating the parameters of the predictive distribution. In two case studies based of 100m wind speed and global horizontal irradiance forecasts of the operational ensemble pre diction system of the Hungarian Meteorological Service, the predictive performance of this novel method is compared with the forecast skill of the raw ensemble and the state-of-the-art parametric approaches. Both case studies confirm that at least up to 48h statistical post-processing substantially improves the predictive performance of the raw ensemble for all considered forecast horizons. The investigated variants of the proposed two-step method outperform in skill their competitors and the suggested new approach is well applicable for different weather quantities and for a fair range of predictive distributions.
Abstract:In the last decades wind power became the second largest energy source in the EU covering 16% of its electricity demand. However, due to its volatility, accurate short range wind power predictions are required for successful integration of wind energy into the electrical grid. Accurate predictions of wind power require accurate hub height wind speed forecasts, where the state of the art method is the probabilistic approach based on ensemble forecasts obtained from multiple runs of numerical weather prediction models. Nonetheless, ensemble forecasts are often uncalibrated and might also be biased, thus require some form of post-processing to improve their predictive performance. We propose a novel flexible machine learning approach for calibrating wind speed ensemble forecasts, which results in a truncated normal predictive distribution. In a case study based on 100m wind speed forecasts produced by the operational ensemble prediction system of the Hungarian Meteorological Service, the forecast skill of this method is compared with the predictive performance of three different ensemble model output statistics approaches and the raw ensemble forecasts. We show that compared with the raw ensemble, post-processing always improves the calibration of probabilistic and accuracy of point forecasts and from the four competing methods the novel machine learning based approach results in the best overall performance.
Abstract:Accurate and reliable forecasting of total cloud cover (TCC) is vital for many areas such as astronomy, energy demand and production, or agriculture. Most meteorological centres issue ensemble forecasts of TCC, however, these forecasts are often uncalibrated and exhibit worse forecast skill than ensemble forecasts of other weather variables. Hence, some form of post-processing is strongly required to improve predictive performance. As TCC observations are usually reported on a discrete scale taking just nine different values called oktas, statistical calibration of TCC ensemble forecasts can be considered a classification problem with outputs given by the probabilities of the oktas. This is a classical area where machine learning methods are applied. We investigate the performance of post-processing using multilayer perceptron (MLP) neural networks, gradient boosting machines (GBM) and random forest (RF) methods. Based on the European Centre for Medium-Range Weather Forecasts global TCC ensemble forecasts for 2002-2014 we compare these approaches with the proportional odds logistic regression (POLR) and multiclass logistic regression (MLR) models, as well as the raw TCC ensemble forecasts. We further assess whether improvements in forecast skill can be obtained by incorporating ensemble forecasts of precipitation as additional predictor. Compared to the raw ensemble, all calibration methods result in a significant improvement in forecast skill. RF models provide the smallest increase in predictive performance, while MLP, POLR and GBM approaches perform best. The use of precipitation forecast data leads to further improvements in forecast skill and except for very short lead times the extended MLP model shows the best overall performance.