Abstract: Reliable uncertainty quantification at unobserved spatial locations, especially for complex and heterogeneous datasets, remains a core challenge in spatial statistics. Traditional approaches such as Kriging rely heavily on assumptions such as normality, which often break down in large-scale, diverse datasets, leading to unreliable prediction intervals. While machine learning methods have emerged as powerful alternatives, they primarily focus on point predictions and provide limited mechanisms for uncertainty quantification. Conformal prediction, a distribution-free framework, offers valid prediction intervals without relying on parametric assumptions. However, existing conformal prediction methods are either not tailored to spatial settings or, when designed for spatial data, rely on restrictive i.i.d. assumptions. In this paper, we propose Localized Spatial Conformal Prediction (LSCP), a conformal prediction method designed specifically for spatial data. LSCP leverages localized quantile regression to construct prediction intervals. Instead of i.i.d. assumptions, our theoretical analysis builds on weaker conditions of stationarity and spatial mixing, which are natural for spatial data; it provides finite-sample bounds on the conditional coverage gap and establishes asymptotic guarantees for conditional coverage. Experiments on both synthetic and real-world datasets demonstrate that LSCP achieves accurate coverage with significantly tighter and more consistent prediction intervals across the spatial domain than existing methods.
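To make the idea of a spatially localized conformal interval concrete, the following is a minimal sketch, not the LSCP procedure itself. LSCP fits a localized quantile regression; this illustration substitutes a simpler stand-in, the empirical conformal quantile of the absolute residuals at the `k` nearest calibration locations. The function name `local_conformal_interval` and its parameters are hypothetical and chosen only for this example.

```python
import math

def local_conformal_interval(cal_points, cal_residuals, query, point_pred,
                             k=20, alpha=0.1):
    """Interval around point_pred at `query`, built from nearby residuals.

    Assumption: a plain empirical quantile of the k nearest calibration
    residuals stands in for the localized quantile regression in the text.
    """
    # Rank calibration locations by Euclidean distance to the query location.
    order = sorted(range(len(cal_points)),
                   key=lambda i: math.dist(cal_points[i], query))
    nearest = sorted(abs(cal_residuals[i]) for i in order[:k])
    n = len(nearest)
    # Split-conformal rank ceil((n + 1) * (1 - alpha)); if it exceeds n,
    # the interval is unbounded.
    rank = math.ceil((n + 1) * (1 - alpha))
    q = nearest[rank - 1] if rank <= n else math.inf
    return point_pred - q, point_pred + q

# Toy usage: residual magnitude grows with distance from the origin.
pts = [(float(i), 0.0) for i in range(10)]
res = [float(i) for i in range(10)]
print(local_conformal_interval(pts, res, (0.0, 0.0), 10.0, k=4, alpha=0.5))
```

Because only nearby residuals enter the quantile, the interval is narrow where the local errors are small and widens where they are large, which is the behavior a localized method aims for.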
Abstract: In recent years, increasingly unpredictable and severe global weather patterns have frequently caused long-lasting power outages. Building resilience, the ability to withstand, adapt to, and recover from major disruptions, has become crucial for the power industry. To enable rapid recovery, accurately predicting future outage numbers is essential. Rather than relying on simple point estimates, we analyze extensive quarter-hourly outage data and develop a graph conformal prediction method that delivers accurate prediction regions for outage numbers across the states over a given time period. We demonstrate the effectiveness of this method through extensive numerical experiments in several states affected by extreme weather events that led to widespread outages.
Abstract: Conformal prediction (CP) has become a popular method for uncertainty quantification because it is distribution-free, model-agnostic, and theoretically sound. For forecasting problems in supervised learning, most CP methods focus on building prediction intervals for univariate responses. In this work, we develop a sequential CP method called $\texttt{MultiDimSPCI}$ that builds prediction regions for a multivariate response, especially in the context of multivariate time series, which are not exchangeable. Theoretically, we derive finite-sample high-probability bounds on the conditional coverage gap. Empirically, we demonstrate that $\texttt{MultiDimSPCI}$ maintains valid coverage on a wide range of multivariate time series while producing smaller prediction regions than CP and non-CP baselines.
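As a rough illustration of a sequential prediction region for a vector-valued response, the sketch below keeps a rolling window of recent residual norms and uses their conformal quantile as the radius of a ball around the point forecast. This is an assumption made for exposition only: the abstract does not specify the shape of the region or how past residuals are used, and the class `RollingBallRegion` is hypothetical rather than part of $\texttt{MultiDimSPCI}$.

```python
import math
from collections import deque

class RollingBallRegion:
    """Ball-shaped region whose radius tracks recent residual norms."""

    def __init__(self, window=50, alpha=0.1):
        # Only the most recent `window` scores are kept, so the region
        # adapts as the series drifts.
        self.scores = deque(maxlen=window)
        self.alpha = alpha

    def update(self, y_true, y_pred):
        # Nonconformity score: Euclidean norm of the residual vector.
        self.scores.append(math.dist(y_true, y_pred))

    def radius(self):
        n = len(self.scores)
        if n == 0:
            return math.inf
        ordered = sorted(self.scores)
        rank = math.ceil((n + 1) * (1 - self.alpha))
        # With too few scores for the requested level, the region is unbounded.
        return ordered[rank - 1] if rank <= n else math.inf

    def contains(self, y, y_pred):
        return math.dist(y, y_pred) <= self.radius()

# Toy usage: residual norms 1, 2, 3, 4 arrive in sequence.
region = RollingBallRegion(window=3, alpha=0.5)
for d in (1.0, 2.0, 3.0, 4.0):
    region.update((d, 0.0), (0.0, 0.0))
print(region.radius())
```

A ball treats all response coordinates alike; a region shaped by the residual covariance would typically be smaller when the coordinates are correlated, at the cost of estimating that covariance.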
Abstract: Modeling and estimation for spatial data are ubiquitous in real life, frequently appearing in weather forecasting, pollution detection, and agriculture. Spatial data analysis often involves processing datasets of enormous scale. In this work, we focus on large-scale internet-quality open datasets from Ookla. We study the estimation of mobile (cellular) internet quality at the scale of a state in the United States. In particular, we aim to conduct estimation based on highly {\it imbalanced} data: most of the samples are concentrated in limited areas, while very few are available in the rest, posing significant challenges to modeling efforts. We propose a new adaptive kernel regression approach that employs self-tuning kernels to alleviate the adverse effects of data imbalance in this problem. Through comparative experiments on two distinct mobile network measurement datasets, we demonstrate that the proposed self-tuning kernel regression method produces more accurate predictions and has the potential to be applied to other problems.
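One way to picture a kernel that adapts to imbalanced sampling is a Nadaraya-Watson estimator in which each sample carries its own bandwidth. The sketch below assumes that the bandwidth is the distance to the sample's `k`-th nearest neighbor, so kernels are narrow in dense areas and wide in sparse ones. The abstract does not state how its self-tuning kernels are defined, so this rule and the helper names `knn_bandwidths` and `adaptive_kernel_regression` are illustrative assumptions.

```python
import math

def knn_bandwidths(points, k=3):
    """Per-sample bandwidth: distance to the k-th nearest other sample."""
    bandwidths = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        h = dists[min(k, len(dists)) - 1]
        # Guard against duplicate locations, which would give h == 0.
        bandwidths.append(max(h, 1e-12))
    return bandwidths

def adaptive_kernel_regression(points, values, query, k=3):
    """Weighted average of `values` with per-sample Gaussian kernels."""
    bandwidths = knn_bandwidths(points, k)
    weights = [
        math.exp(-math.dist(p, query) ** 2 / (2 * h ** 2))
        for p, h in zip(points, bandwidths)
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from every sample")
    return sum(w * v for w, v in zip(weights, values)) / total

# Toy usage: a dense cluster near the origin and two isolated samples.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (9.0, 1.0)]
vals = [1.0, 1.2, 0.9, 3.0, 2.0]
print(adaptive_kernel_regression(pts, vals, (4.0, 4.0), k=2))
```

With a single global bandwidth, a value small enough to resolve the dense cluster would leave queries in sparse areas with almost no weight; per-sample bandwidths let the isolated samples still inform nearby estimates.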
Abstract: Inverse wave scattering aims to determine the properties of an object from data on how the object scatters incoming waves. To collect this information, sensors are placed at different locations to send waves to and receive waves from each other. The choice of sensor positions and incident wave frequencies determines the reconstruction quality of the scatterer properties. This paper uses reinforcement learning to develop a precision imaging approach that chooses sensor positions and wave frequencies adaptively for different scatterers, thus obtaining a significant improvement in reconstruction quality with limited imaging resources. Extensive numerical results are provided to demonstrate the superiority of the proposed method over existing methods.