Abstract:Advancements in artificial intelligence (AI) and deep learning have led to neural networks being used to generate lightning-speed answers to complex questions, to paint like Monet, or to write like Proust. Leveraging their computational speed and flexibility, neural networks are also being used to facilitate fast, likelihood-free statistical inference. However, it is not straightforward to use neural networks with data that are incomplete for various reasons, which precludes their use in many applications. A recently proposed approach to remedy this issue inputs an appropriately padded data vector and a vector that encodes the missingness pattern to a neural network. While computationally efficient, this "masking" approach can result in statistically inefficient inferences. Here, we propose an alternative approach that is based on the Monte Carlo expectation-maximization (EM) algorithm. Our EM approach is likelihood-free, substantially faster than the conventional EM algorithm as it does not require numerical optimization at each iteration, and more statistically efficient than the masking approach. This research represents a prototype problem that indicates how improvements could be made in AI by introducing Bayesian statistical thinking. We compare the two approaches to missingness using simulated incomplete data from two models: a spatial Gaussian process model, and a spatial Potts model. The utility of the methodology is shown on Arctic sea-ice data and cryptocurrency data.
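The "masking" input described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fill value of zero, and the convention that 1 marks an observed entry are all assumptions made for illustration. The abstract only states that the data vector is "appropriately padded" and accompanied by a vector encoding the missingness pattern.

```python
import math

def encode_masked(data, fill_value=0.0):
    """Split an incomplete data vector into (padded, mask).

    Missing entries are given as None or NaN. The fill value and the
    mask convention (1 = observed, 0 = missing) are assumptions.
    """
    padded, mask = [], []
    for x in data:
        missing = x is None or (isinstance(x, float) and math.isnan(x))
        padded.append(fill_value if missing else float(x))
        mask.append(0 if missing else 1)
    return padded, mask

def network_input(data, fill_value=0.0):
    """Concatenate padded data and mask into one fixed-length input vector."""
    padded, mask = encode_masked(data, fill_value)
    return padded + mask

# Usage: a vector with two missing entries.
z = [1.2, None, -0.7, float("nan")]
u, w = encode_masked(z)
```

Because the padded vector and the mask have the same length regardless of which entries are missing, a single network with a fixed input size can in principle process any missingness pattern.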
Abstract:Simulation-based methods for making statistical inference have evolved dramatically over the past 50 years, keeping pace with technological advancements. The field is undergoing a new revolution as it embraces the representational capacity of neural networks, optimisation libraries, and graphics processing units for learning complex mappings between data and inferential targets. The resulting tools are amortised, in the sense that they allow inference to be made quickly through fast feedforward operations. In this article we review recent progress made in the context of point estimation, approximate Bayesian inference, the automatic construction of summary statistics, and likelihood approximation. The review also covers available software, and includes a simple illustration to showcase the wide array of tools available for amortised inference and the benefits they offer over state-of-the-art Markov chain Monte Carlo methods. The article concludes with an overview of relevant topics and an outlook on future research directions.
Abstract:Neural Bayes estimators are neural networks that approximate Bayes estimators in a fast and likelihood-free manner. They are appealing to use with spatial models and data, where estimation is often a computational bottleneck. However, neural Bayes estimators in spatial applications have, to date, been restricted to data collected over a regular grid. These estimators are also currently dependent on a prescribed set of spatial locations, which means that the neural network needs to be re-trained for new data sets; this renders them impractical in many applications and impedes their widespread adoption. In this work, we employ graph neural networks to tackle the important problem of parameter estimation from data collected over arbitrary spatial locations. In addition to extending neural Bayes estimation to irregular spatial data, our architecture leads to substantial computational benefits, since the estimator can be used with any arrangement or number of locations and independent replicates, thus amortising the cost of training for a given spatial model. We also facilitate fast uncertainty quantification by training an accompanying neural Bayes estimator that approximates a set of marginal posterior quantiles. We illustrate our methodology on Gaussian and max-stable processes. Finally, we showcase our methodology in a global sea-surface temperature application, where we estimate the parameters of a Gaussian process model in 2,161 regions, each containing thousands of irregularly-spaced data points, in just a few minutes with a single graphics processing unit.
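An estimator that targets a posterior quantile can be obtained by minimising the quantile (pinball) loss, since the Bayes estimator under that loss is the corresponding posterior quantile. The sketch below is a minimal illustration of this fact using a grid search over toy samples in place of a neural network; the function names and the toy data are assumptions, and the abstract does not specify the loss used in the paper.

```python
def pinball_loss(estimate, theta, tau):
    """Quantile loss at level tau for one (estimate, true value) pair."""
    diff = theta - estimate
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

def average_loss(estimate, samples, tau):
    """Monte Carlo approximation of the expected pinball loss."""
    return sum(pinball_loss(estimate, t, tau) for t in samples) / len(samples)

# Toy "posterior" samples: an evenly spaced grid on [0, 1].
samples = [i / 100 for i in range(101)]

# Minimise the average loss over candidate estimates (grid search here;
# a neural estimator would instead be trained by gradient descent).
tau = 0.9
best = min(samples, key=lambda q: average_loss(q, samples, tau))
```

Here `best` lands at the empirical 0.9 quantile of the samples. Training one output per level `tau` would give a set of marginal quantiles, from which intervals can be read off.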
Abstract:Inference for spatial extremal dependence models can be computationally burdensome in moderate-to-high dimensions due to their reliance on intractable and/or censored likelihoods. Exploiting recent advances in likelihood-free inference with neural Bayes estimators (that is, neural estimators that target Bayes estimators), we develop a novel approach to construct highly efficient estimators for censored peaks-over-threshold models by encoding censoring information in the neural network architecture. Our new method provides a paradigm shift that challenges traditional censored likelihood-based inference for spatial extremes. Our simulation studies highlight significant gains in both computational and statistical efficiency, relative to competing likelihood-based approaches, when applying our novel estimators for inference of popular extremal dependence models, such as max-stable, $r$-Pareto, and random scale mixture processes. We also illustrate that it is possible to train a single estimator for a general censoring level, obviating the need to retrain when the censoring level is changed. We illustrate the efficacy of our estimators by making fast inference on hundreds of thousands of high-dimensional spatial extremal dependence models to assess the concentration of particulate matter 2.5 microns or less in diameter (PM2.5) over the whole of Saudi Arabia.
Abstract:Neural networks have recently shown promise for likelihood-free inference, providing orders-of-magnitude speed-ups over classical methods. However, current implementations are suboptimal when estimating parameters from independent replicates. In this paper, we use a decision-theoretic framework to argue that permutation-invariant neural networks are ideally placed for constructing Bayes estimators for arbitrary models, provided that simulation from these models is straightforward. We illustrate the potential of these estimators on both conventional spatial models and highly parameterised spatial-extremes models, and show that they considerably outperform neural estimators that do not account for replication appropriately in their network design. At the same time, they are highly competitive with, and much faster than, traditional likelihood-based estimators. We apply our estimator to a spatial analysis of sea-surface temperature in the Red Sea where, after training, we obtain parameter estimates, and uncertainty quantification of the estimates via bootstrap sampling, from hundreds of spatial fields in a fraction of a second.
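The permutation-invariant structure mentioned in this abstract can be sketched as follows: each replicate is summarised separately, the summaries are pooled by a symmetric operation, and the pooled summary is mapped to an estimate. This is a minimal sketch under stated assumptions, not the paper's architecture: `psi` and `phi` are hypothetical toy functions standing in for neural networks, and mean pooling is an assumed choice of symmetric aggregation.

```python
import math
import random

def psi(z):
    """Toy per-replicate summary (stand-in for an inner neural network)."""
    mean = sum(z) / len(z)
    var = sum((x - mean) ** 2 for x in z) / len(z)
    return [mean, var]

def phi(t):
    """Toy map from pooled summary to estimate (stand-in for an outer network)."""
    return [t[0], math.log1p(t[1])]

def estimator(replicates):
    """Apply psi to each replicate, average the summaries, then apply phi."""
    summaries = [psi(z) for z in replicates]
    m = len(summaries)
    pooled = [sum(s[k] for s in summaries) / m for k in range(len(summaries[0]))]
    return phi(pooled)

# Usage: four independent replicates of a five-dimensional data vector.
random.seed(0)
reps = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(4)]
estimate = estimator(reps)
shuffled = estimator(list(reversed(reps)))  # same estimate up to rounding
```

Because averaging does not depend on the order of the summaries, the estimate is unchanged (up to floating-point rounding) when the replicates are reordered, and the same function accepts any number of replicates.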