Abstract:Accurately representing surface weather at the sub-kilometer scale is crucial for optimal decision-making in a wide range of applications. This motivates the use of statistical techniques to provide accurate and calibrated probabilistic predictions at a lower cost compared to numerical simulations. Wind represents a particularly challenging variable to model due to its high spatial and temporal variability. This paper presents a novel approach that integrates Gaussian processes (GPs) and neural networks to model surface wind gusts, leveraging multiple data sources, including numerical weather prediction (NWP) models, digital elevation models (DEM), and in-situ measurements. Results demonstrate the added value of modeling the multivariate covariance structure of the variable of interest, as opposed to only applying a univariate probabilistic regression approach. Modeling the covariance enables the optimal integration of observed measurements from ground stations, which is shown to reduce the continuous ranked probability score compared to the baseline. Moreover, it allows the direct generation of realistic fields that are also marginally calibrated, aided by scalable techniques such as Random Fourier Features (RFF) and pathwise conditioning. We discuss the effect of different modeling choices, as well as different degrees of approximation, and present our results for a case study.