Simon Maskell

Humble your Overconfident Networks: Unlearning Overfitting via Sequential Monte Carlo Tempered Deep Ensembles

May 16, 2025

Andrew Millard, Zheng Zhao, Joshua Murphy, Simon Maskell

Abstract:Sequential Monte Carlo (SMC) methods offer a principled approach to Bayesian uncertainty quantification but are traditionally limited by the need for full-batch gradient evaluations. We introduce a scalable variant by incorporating Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) proposals into SMC, enabling efficient mini-batch based sampling. Our resulting SMCSGHMC algorithm outperforms standard stochastic gradient descent (SGD) and deep ensembles across image classification, out-of-distribution (OOD) detection, and transfer learning tasks. We further show that SMCSGHMC mitigates overfitting and improves calibration, providing a flexible, scalable pathway for converting pretrained neural networks into well-calibrated Bayesian models.

Efficient MCMC Sampling with Expensive-to-Compute and Irregular Likelihoods

May 15, 2025

Conor Rosato, Harvinder Lehal, Simon Maskell, Lee Devlin, Malcolm Strens

Abstract:Bayesian inference with Markov Chain Monte Carlo (MCMC) is challenging when the likelihood function is irregular and expensive to compute. We explore several sampling algorithms that make use of subset evaluations to reduce computational overhead. We adapt the subset samplers for this setting where gradient information is not available or is unreliable. To achieve this, we introduce data-driven proxies in place of Taylor expansions and define a novel computation-cost aware adaptive controller. We undertake an extensive evaluation for a challenging disease modelling task and a configurable task with similar irregularity in the likelihood surface. We find our improved version of Hierarchical Importance with Nested Training Samples (HINTS), with adaptive proposals and a data-driven proxy, obtains the best sampling error in a fixed computational budget. We conclude that subset evaluations can provide cheap and naturally-tempered exploration, while a data-driven proxy can pre-screen proposals successfully in explored regions of the state space. These two elements combine through hierarchical delayed acceptance to achieve efficient, exact sampling.

* 45 pages

Utilising Gradient-Based Proposals Within Sequential Monte Carlo Samplers for Training of Partial Bayesian Neural Networks

May 01, 2025

Andrew Millard, Joshua Murphy, Simon Maskell, Zheng Zhao

Abstract:Partial Bayesian neural networks (pBNNs) have been shown to perform competitively with fully Bayesian neural networks while only having a subset of the parameters be stochastic. Using sequential Monte Carlo (SMC) samplers as the inference method for pBNNs gives a non-parametric probabilistic estimation of the stochastic parameters, and has shown improved performance over parametric methods. In this paper we introduce a new SMC-based training method for pBNNs by utilising a guided proposal and incorporating gradient-based Markov kernels, which gives us better scalability on high dimensional problems. We show that our new method outperforms the state-of-the-art in terms of predictive performance and optimal loss. We also show that pBNNs scale well with larger batch sizes, resulting in significantly reduced training times and often better performance.

Poisson multi-Bernoulli mixture filter for trajectory measurements

Apr 11, 2025

Marco Fontana, Ángel F. García-Fernández, Simon Maskell

Abstract:This paper presents a Poisson multi-Bernoulli mixture (PMBM) filter for multi-target filtering based on sensor measurements that are sets of trajectories over a window of the last two time steps. The proposed filter, the trajectory measurement PMBM (TM-PMBM) filter, propagates a PMBM density on the set of target states. In prediction, the filter obtains the PMBM density on the set of trajectories over the last two time steps. This density is then updated with the set of trajectory measurements. After the update step, the PMBM posterior on the set of two-step trajectories is marginalised to obtain a PMBM density on the set of target states. The filter provides a closed-form solution for multi-target filtering based on sets of trajectory measurements, estimating the set of target states at the end of each time window. Additionally, the paper proposes computationally lighter alternatives to the TM-PMBM filter by deriving a Poisson multi-Bernoulli (PMB) density through Kullback-Leibler divergence minimisation in an augmented space with auxiliary variables. The performance of the proposed filters is evaluated in a simulation study.

* 16 pages, 7 figures, journal paper

Incorporating the ChEES Criterion into Sequential Monte Carlo Samplers

Apr 03, 2025

Andrew Millard, Joshua Murphy, Daniel Frisch, Simon Maskell

Abstract:Markov chain Monte Carlo (MCMC) methods are a powerful but computationally expensive way of performing non-parametric Bayesian inference. MCMC proposals which utilise gradients, such as Hamiltonian Monte Carlo (HMC), can better explore the parameter space of interest if the additional hyper-parameters are chosen well. The No-U-Turn Sampler (NUTS) is a variant of HMC which is extremely effective at selecting these hyper-parameters but is slow to run and is not suited to GPU architectures. An alternative to NUTS, Change in the Estimator of the Expected Square HMC (ChEES-HMC), was shown not only to run faster than NUTS on GPU but also to sample from posteriors more efficiently. Sequential Monte Carlo (SMC) samplers are another sampling method which instead output weighted samples from the posterior. They are very amenable to parallelisation, and therefore to being run on GPUs, while having additional flexibility in their choice of proposal over MCMC. We incorporate ChEES-HMC as a proposal into SMC samplers and demonstrate performance that is competitive with, but faster than, NUTS on a number of tasks.

* 16 pages, 9 figures
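
The abstract above notes that SMC samplers output weighted samples rather than an unweighted chain. As a rough illustration of what that means in practice, the sketch below normalises a list of log-weights and computes the effective sample size (ESS), a common diagnostic for weighted samples. The function names are hypothetical and are not taken from the paper; only the Python standard library is used.

```python
import math


def normalise_log_weights(log_weights):
    """Convert unnormalised log-weights into weights that sum to one.

    Subtracting the maximum before exponentiating (the log-sum-exp trick)
    avoids overflow when the log-weights are large in magnitude.
    """
    largest = max(log_weights)
    shifted = [math.exp(lw - largest) for lw in log_weights]
    total = sum(shifted)
    return [s / total for s in shifted]


def effective_sample_size(log_weights):
    """ESS = 1 / sum_i w_i^2, where w_i are the normalised weights."""
    weights = normalise_log_weights(log_weights)
    return 1.0 / sum(w * w for w in weights)
```

With equal weights the ESS equals the number of samples; when a single sample carries almost all of the weight, the ESS approaches one, which is typically the signal to resample.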

An Entropic Metric for Measuring Calibration of Machine Learning Models

Feb 20, 2025

Daniel James Sumler, Lee Devlin, Simon Maskell, Richard O. Lane

Abstract:Understanding the confidence with which a machine learning model classifies an input datum is an important, and perhaps under-investigated, concept. In this paper, we propose a new calibration metric, the Entropic Calibration Difference (ECD). Based on existing research in the field of state estimation, specifically target tracking (TT), we show how ECD may be applied to binary classification machine learning models. We describe the relative importance of under- and over-confidence and how they are not conflated in the TT literature. Indeed, our metric distinguishes under- from over-confidence. We consider this important given that algorithms that are under-confident are likely to be 'safer' than algorithms that are over-confident, albeit at the expense of also being over-cautious and so statistically inefficient. We demonstrate how this new metric performs on real and simulated data and compare with other metrics for machine learning model probability calibration, including the Expected Calibration Error (ECE) and its signed counterpart, the Expected Signed Calibration Error (ESCE).
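
For context on the baselines named in the abstract, here is a minimal sketch of a binned Expected Calibration Error for a binary classifier, together with a signed variant. It does not implement the proposed ECD. The equal-width binning, the 0.5 decision threshold, and the sign convention (positive meaning over-confident) are assumptions made for illustration and may differ from the paper's definitions.

```python
def calibration_errors(probs, labels, n_bins=10):
    """Return (ECE, signed ECE) for a binary classifier.

    probs  : predicted probabilities of class 1
    labels : true labels, each 0 or 1
    The signed value is positive when confidence exceeds accuracy
    (assumed convention: positive means over-confident).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        predicted = 1 if p >= 0.5 else 0
        confidence = p if predicted == 1 else 1.0 - p
        correct = 1.0 if predicted == y else 0.0
        # Equal-width bins on [0, 1]; confidence 1.0 goes in the last bin.
        index = min(int(confidence * n_bins), n_bins - 1)
        bins[index].append((confidence, correct))

    n = len(probs)
    ece = 0.0
    signed = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(k for _, k in members) / len(members)
        gap = mean_confidence - accuracy
        ece += len(members) / n * abs(gap)
        signed += len(members) / n * gap
    return ece, signed
```

The unsigned value cannot tell an over-confident model from an under-confident one with the same gap, which is the distinction the abstract says the proposed metric is designed to make.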

Fully Bayesian Wideband Direction-of-Arrival Estimation and Detection via RJMCMC

Dec 12, 2024

Kyurae Kim, Philip T. Clemson, James P. Reilly, Jason F. Ralph, Simon Maskell

Abstract:We propose a fully Bayesian approach to wideband, or broadband, direction-of-arrival (DoA) estimation and signal detection. Unlike previous works in wideband DoA estimation and detection, where the signals were modeled in the time-frequency domain, we directly model the time-domain representation and treat the non-causal part of the source signal as latent variables. Furthermore, our Bayesian model allows for closed-form marginalization of the latent source signals by leveraging conjugacy. To further speed up computation, we exploit the sparse "stripe matrix structure" of the considered system, which stems from the circulant matrix representation of linear time-invariant (LTI) systems. This drastically reduces the time complexity of computing the likelihood from $\mathcal{O}(N^3 k^3)$ to $\mathcal{O}(N k^3)$, where $N$ is the number of samples received by the array and $k$ is the number of sources. These computational improvements allow for efficient posterior inference through reversible jump Markov chain Monte Carlo (RJMCMC). We use the non-reversible extension of RJMCMC (NRJMCMC), which often achieves lower autocorrelation and faster convergence than the conventional reversible variant. Detection, estimation, and reconstruction of the latent source signals can then all be performed in a fully Bayesian manner through the samples drawn using NRJMCMC. We evaluate the detection performance of the procedure by comparing against generalized likelihood ratio testing (GLRT) and information criteria.

Enhanced SMC$^2$: Leveraging Gradient Information from Differentiable Particle Filters Within Langevin Proposals

Jul 24, 2024

Conor Rosato, Joshua Murphy, Alessandro Varsi, Paul Horridge, Simon Maskell

Abstract:Sequential Monte Carlo Squared (SMC$^2$) is a Bayesian method which can infer the states and parameters of non-linear, non-Gaussian state-space models. The standard random-walk proposal in SMC$^2$ faces challenges, particularly with high-dimensional parameter spaces. This study outlines a novel approach by harnessing first-order gradients derived from a Common Random Numbers - Particle Filter (CRN-PF) using PyTorch. The resulting gradients can be leveraged within a Langevin proposal without accept/reject. Including Langevin dynamics within the proposal can result in a higher effective sample size and more accurate parameter estimates when compared with the random-walk proposal. The resulting algorithm is parallelized on distributed memory using Message Passing Interface (MPI) and runs in $\mathcal{O}(\log_2 N)$ time complexity. Utilizing 64 computational cores, we obtain a 51x speed-up when compared to a single core. A GitHub link is given which provides access to the code.

* 8 pages, 3 images. Accepted to 2024 IEEE International Conference on Multisensor Fusion and Integration (MFI 2024). https://mfi2024.org/. arXiv admin note: text overlap with arXiv:2311.12973
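
To make "a Langevin proposal without accept/reject" concrete, the sketch below shows one common discretisation of an unadjusted Langevin step: a half-step along the gradient of the log target plus Gaussian noise. This is an illustration under assumptions, not the paper's implementation: the gradient is supplied as a plain callable rather than obtained from a differentiable particle filter, and the function name and the step-size parameterisation are hypothetical.

```python
import random


def langevin_proposal(theta, grad_log_target, step_size, rng):
    """Propose a new parameter vector with an unadjusted Langevin step.

    theta           : current parameter vector (list of floats)
    grad_log_target : callable returning the gradient of the log target at theta
    step_size       : discretisation step (epsilon)
    rng             : random.Random instance supplying the Gaussian noise
    The proposal is theta + (epsilon^2 / 2) * gradient + epsilon * noise,
    and it is returned directly, with no accept/reject correction.
    """
    gradient = grad_log_target(theta)
    return [
        t + 0.5 * step_size ** 2 * g + step_size * rng.gauss(0.0, 1.0)
        for t, g in zip(theta, gradient)
    ]
```

Compared with a random-walk proposal, which uses only the noise term, the gradient term drifts each proposal towards regions of higher target density.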

Non-Myopic Sensor Control for Target Search and Track Using a Sample-Based GOSPA Implementation

Aug 14, 2023

Marcel Hernandez, Angel Garcia-Fernandez, Simon Maskell

Abstract:This paper is concerned with sensor management for target search and track using the generalised optimal subpattern assignment (GOSPA) metric. Utilising the GOSPA metric to predict future system performance is computationally challenging, because of the need to account for uncertainties within the scenario, notably the number of targets, the locations of targets, and the measurements generated by the targets subsequent to performing sensing actions. In this paper, efficient sample-based techniques are developed to calculate the predicted mean square GOSPA metric. These techniques allow for missed detections and false alarms, and thereby enable the metric to be exploited in scenarios more complex than those previously considered. Furthermore, the GOSPA methodology is extended to perform non-myopic (i.e. multi-step) sensor management via the development of a Bellman-type recursion that optimises a conditional GOSPA-based metric. Simulations for scenarios with missed detections, false alarms, and planning horizons of up to three time steps demonstrate the approach, in particular showing that optimal plans align with an intuitive understanding of how taking into account the opportunity to make future observations should influence the current action. It is concluded that the GOSPA-based, non-myopic search and track algorithm offers a powerful mechanism for sensor management.

* The paper has been submitted for publication in IEEE Transactions on Aerospace and Electronic Systems and is currently in review
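
For readers unfamiliar with the GOSPA metric referred to above, the following sketch evaluates it between two small sets of scalar target positions, using the standard form with alpha = 2: the best assignment cost with distances capped at the cut-off c, plus a penalty of c^p / 2 for each unmatched (missed or false) target. It computes the metric only and does not attempt the paper's sample-based prediction or Bellman-type recursion. The brute-force search over assignments is an illustrative choice that is practical only for very small sets, and the function name and defaults are hypothetical.

```python
import itertools
import math


def gospa(xs, ys, c=2.0, p=2, alpha=2):
    """GOSPA distance between two finite sets of scalars, base distance |x - y|.

    c     : cut-off distance
    p     : exponent
    alpha : cardinality penalty factor (alpha = 2 is the usual choice)
    """
    if len(xs) > len(ys):
        xs, ys = ys, xs  # the metric is symmetric; make xs the smaller set
    n_x, n_y = len(xs), len(ys)

    # Try every way of assigning each element of xs to a distinct element of ys.
    best = math.inf
    for assignment in itertools.permutations(range(n_y), n_x):
        cost = sum(min(abs(x - ys[j]), c) ** p for x, j in zip(xs, assignment))
        best = min(best, cost)

    cardinality_penalty = (c ** p / alpha) * (n_y - n_x)
    return (best + cardinality_penalty) ** (1.0 / p)
```

With alpha = 2, a matched pair that is farther apart than c costs the same as one missed target plus one false target, which is what lets the metric be read as a sum of localisation, missed-target, and false-target costs.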

Bayesian Decision Trees Inspired from Evolutionary Algorithms

May 30, 2023

Efthyvoulos Drousiotis, Alexander M. Phillips, Paul G. Spirakis, Simon Maskell

Abstract:Bayesian Decision Trees (DTs) are generally considered more advanced and accurate models than regular Decision Trees because they can handle complex and uncertain data. Existing work on Bayesian DTs uses Markov Chain Monte Carlo (MCMC) with an accept-reject mechanism and samples using naive proposals to proceed to the next iteration, which can be slow because of the burn-in time needed. We can reduce the burn-in period by proposing a more sophisticated way of sampling or by designing a different numerical Bayesian approach. In this paper, we propose replacing MCMC with an inherently parallel algorithm, Sequential Monte Carlo (SMC), and a more effective sampling strategy inspired by Evolutionary Algorithms (EAs). Experiments show that SMC combined with the EA can produce more accurate results compared to MCMC in 100 times fewer iterations.

* arXiv admin note: text overlap with arXiv:2301.09090
