Nawaf Bou-Rabee

Ballistic Convergence in Hit-and-Run Monte Carlo and a Coordinate-free Randomized Kaczmarz Algorithm

Dec 10, 2024

Nawaf Bou-Rabee, Andreas Eberle, Stefan Oberdörster

Abstract: Hit-and-Run is a coordinate-free Gibbs sampler, yet the quantitative advantages of its coordinate-free property remain largely unexplored beyond empirical studies. In this paper, we prove sharp estimates for the Wasserstein contraction of Hit-and-Run in Gaussian target measures via coupling methods and conclude mixing time bounds. Our results uncover ballistic and superdiffusive convergence rates in certain settings. Furthermore, we extend these insights to a coordinate-free variant of the randomized Kaczmarz algorithm, an iterative method for linear systems, and demonstrate analogous convergence rates. These findings offer new insights into the advantages and limitations of coordinate-free methods for both sampling and optimization.

* 28 pages
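
For readers unfamiliar with Hit-and-Run, the following is a minimal sketch of one transition for a centered Gaussian target, assuming the target is specified by a symmetric positive definite precision matrix. The function names (`random_direction`, `hit_and_run_step`) and the plain-list matrix representation are illustrative choices, not taken from the paper. Each step draws a uniformly random direction and then samples exactly from the target restricted to the line through the current point in that direction.

```python
import math
import random


def random_direction(d, rng):
    """Uniform random unit vector in R^d, obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def hit_and_run_step(x, precision, rng):
    """One Hit-and-Run step for the Gaussian N(0, precision^{-1}).

    `precision` is assumed symmetric positive definite (list of rows).
    """
    d = len(x)
    u = random_direction(d, rng)
    # A u, u^T A u and x^T A u for the precision matrix A.
    Au = [sum(precision[i][j] * u[j] for j in range(d)) for i in range(d)]
    uAu = sum(u[i] * Au[i] for i in range(d))
    xAu = sum(x[i] * Au[i] for i in range(d))
    # Restricted to the line x + t*u, the target is Gaussian in t
    # with mean -xAu/uAu and variance 1/uAu, so t can be drawn exactly.
    t = rng.gauss(-xAu / uAu, 1.0 / math.sqrt(uAu))
    return [x[i] + t * u[i] for i in range(d)]


def hit_and_run_chain(x0, precision, n_steps, seed=0):
    """Run the chain for n_steps and return the list of visited states."""
    rng = random.Random(seed)
    x, path = list(x0), []
    for _ in range(n_steps):
        x = hit_and_run_step(x, precision, rng)
        path.append(x)
    return path
```

Calling `hit_and_run_chain([0.0, 0.0], [[1.0, 0.0], [0.0, 4.0]], 20000)` should give samples whose empirical coordinate variances are close to 1 and 0.25. The direction is drawn independently of any coordinate system, which is the coordinate-free property the abstract refers to.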

Mixing of the No-U-Turn Sampler and the Geometry of Gaussian Concentration

Oct 09, 2024

Nawaf Bou-Rabee, Stefan Oberdörster

Abstract: We prove that the mixing time of the No-U-Turn Sampler (NUTS), when initialized in the concentration region of the canonical Gaussian measure, scales as $d^{1/4}$, up to logarithmic factors, where $d$ is the dimension. This scaling is expected to be sharp. This result is based on a coupling argument that leverages the geometric structure of the target distribution. Specifically, concentration of measure results in a striking uniformity in NUTS' locally adapted transitions, which holds with high probability. This uniformity is formalized by interpreting NUTS as an accept/reject Markov chain, where the mixing properties for the more uniform accept chain are analytically tractable. Additionally, our analysis uncovers a previously unnoticed issue with the path length adaptation procedure of NUTS, specifically related to looping behavior, which we address in detail.

* a list of open problems is included; for companion code, see http://github.com/oberdoerster/Mixing-of-NUTS

GIST: Gibbs self-tuning for locally adaptive Hamiltonian Monte Carlo

Apr 23, 2024

Nawaf Bou-Rabee, Bob Carpenter, Milo Marsden

Abstract: We present a novel and flexible framework for localized tuning of Hamiltonian Monte Carlo samplers by sampling the algorithm's tuning parameters conditionally based on the position and momentum at each step. For adaptively sampling path lengths, we show that randomized Hamiltonian Monte Carlo, the No-U-Turn Sampler, and the Apogee-to-Apogee Path Sampler all fit within this unified framework as special cases. The framework is illustrated with a simple alternative to the No-U-Turn Sampler for locally adapting path lengths.

* for companion code, see https://github.com/bob-carpenter/adaptive-hmc
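
The idea of sampling a tuning parameter conditionally on position and momentum can be sketched as follows. This is an illustrative construction under stated assumptions, not the authors' reference implementation (see the companion repository for that). It assumes the tuning parameter is the number of leapfrog steps `L`, drawn uniformly from 1 up to the number of steps at which the trajectory first starts to turn back, and it corrects for the state dependence of that conditional distribution in the Metropolis acceptance ratio. All function names and the `max_steps` cap are hypothetical.

```python
import math
import random


def leapfrog(theta, rho, grad_U, h):
    """One leapfrog (velocity Verlet) step of size h."""
    g = grad_U(theta)
    rho = [r - 0.5 * h * gi for r, gi in zip(rho, g)]
    theta = [t + h * r for t, r in zip(theta, rho)]
    g = grad_U(theta)
    rho = [r - 0.5 * h * gi for r, gi in zip(rho, g)]
    return theta, rho


def steps_to_u_turn(theta, rho, grad_U, h, max_steps):
    """Leapfrog steps until the distance from the start first decreases (capped)."""
    t, r = theta, rho
    for n in range(1, max_steps + 1):
        t, r = leapfrog(t, r, grad_U, h)
        if sum((a - b) * c for a, b, c in zip(t, theta, r)) < 0.0:
            return n
    return max_steps


def self_tuned_hmc_step(theta, U, grad_U, h, rng, max_steps=1000):
    """One transition with a conditionally sampled number of leapfrog steps."""
    rho = [rng.gauss(0.0, 1.0) for _ in theta]        # full momentum refreshment
    n_fwd = steps_to_u_turn(theta, rho, grad_U, h, max_steps)
    L = rng.randint(1, n_fwd)                         # L ~ Uniform{1, ..., n_fwd}
    t, r = theta, rho
    for _ in range(L):
        t, r = leapfrog(t, r, grad_U, h)
    r = [-v for v in r]                               # momentum flip makes the map an involution
    n_bwd = steps_to_u_turn(t, r, grad_U, h, max_steps)
    if L > n_bwd:                                     # L has zero probability from the proposal
        return theta
    H0 = U(theta) + 0.5 * sum(v * v for v in rho)
    H1 = U(t) + 0.5 * sum(v * v for v in r)
    # Energy difference plus the ratio of conditional probabilities (1/n_bwd) / (1/n_fwd).
    log_accept = (H0 - H1) + math.log(n_fwd) - math.log(n_bwd)
    if rng.random() < math.exp(min(0.0, log_accept)):
        return t
    return theta
```

For a standard normal target one could pass `U = lambda th: 0.5 * sum(v * v for v in th)` and `grad_U = lambda th: list(th)`. Without the `n_fwd / n_bwd` factor and the `L > n_bwd` rejection, the chain would in general not leave the target invariant, because the distribution of `L` depends on the current state.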

Randomized Runge-Kutta-Nyström

Oct 11, 2023

Nawaf Bou-Rabee, Tore Selland Kleppe

Abstract: We present 5/2- and 7/2-order $L^2$-accurate randomized Runge-Kutta-Nyström methods to approximate the Hamiltonian flow underlying various non-reversible Markov chain Monte Carlo chains including unadjusted Hamiltonian Monte Carlo and unadjusted kinetic Langevin chains. Quantitative 5/2-order $L^2$-accuracy upper bounds are provided under gradient and Hessian Lipschitz assumptions on the potential energy function. The superior complexity of the corresponding Markov chains is numerically demonstrated for a selection of 'well-behaved', high-dimensional target distributions.

Unadjusted Hamiltonian MCMC with Stratified Monte Carlo Time Integration

Dec 15, 2022

Nawaf Bou-Rabee, Milo Marsden

Abstract: A novel randomized time integrator is suggested for unadjusted Hamiltonian Monte Carlo (uHMC) in place of the usual Verlet integrator; namely, a stratified Monte Carlo (sMC) integrator which involves a minor modification to Verlet, and hence, is easy to implement. For target distributions of the form $\mu(dx) \propto e^{-U(x)} dx$ where $U: \mathbb{R}^d \to \mathbb{R}_{\ge 0}$ is both $K$-strongly convex and $L$-gradient Lipschitz, and initial distributions $\nu$ with finite second moment, coupling proofs reveal that an $\varepsilon$-accurate approximation of the target distribution $\mu$ in $L^2$-Wasserstein distance $\boldsymbol{\mathcal{W}}^2$ can be achieved by the uHMC algorithm with sMC time integration using $O\left((d/K)^{1/3} (L/K)^{5/3} \varepsilon^{-2/3} \log( \boldsymbol{\mathcal{W}}^2(\mu, \nu) / \varepsilon)^+\right)$ gradient evaluations; whereas without additional assumptions the corresponding complexity of the uHMC algorithm with Verlet time integration is in general $O\left((d/K)^{1/2} (L/K)^2 \varepsilon^{-1} \log( \boldsymbol{\mathcal{W}}^2(\mu, \nu) / \varepsilon)^+ \right)$. Duration randomization, which has an effect similar to that of partial momentum refreshment, is also treated. In this case, without additional assumptions on the target distribution, the complexity of duration-randomized uHMC with sMC time integration improves to $O\left(\max\left((d/K)^{1/4} (L/K)^{3/2} \varepsilon^{-1/2},(d/K)^{1/3} (L/K)^{4/3} \varepsilon^{-2/3} \right) \right)$ up to logarithmic factors. The improvement due to duration randomization turns out to be analogous to that of time integrator randomization.

* 39 pages, 2 figures
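
To give a rough sense of what a randomized modification of Verlet can look like, here is a hedged sketch in which the force is evaluated once per step at a uniformly random time inside the step, along the straight line given by the current velocity. This is our reading of the general idea and should be treated as an assumption; the exact sMC scheme is defined in the paper. The names `randomized_force_step` and `integrate` are illustrative.

```python
import math
import random


def randomized_force_step(x, v, force, h, rng):
    """One step of size h with a single force evaluation at a random time.

    Assumption: the force is sampled at x + s*v with s ~ Uniform(0, h),
    then used for both the position and the velocity update.
    """
    s = rng.random() * h
    x_sample = [xi + s * vi for xi, vi in zip(x, v)]
    f = force(x_sample)
    x_new = [xi + h * vi + 0.5 * h * h * fi for xi, vi, fi in zip(x, v, f)]
    v_new = [vi + h * fi for vi, fi in zip(v, f)]
    return x_new, v_new


def integrate(x0, v0, force, h, n_steps, seed=0):
    """Apply n_steps randomized steps starting from (x0, v0)."""
    rng = random.Random(seed)
    x, v = list(x0), list(v0)
    for _ in range(n_steps):
        x, v = randomized_force_step(x, v, force, h, rng)
    return x, v
```

As a sanity check, for the harmonic force `force = lambda x: [-xi for xi in x]` with `x0 = [1.0]`, `v0 = [0.0]`, step size `0.01`, and 100 steps, the result should be close to the exact flow `(cos(1), -sin(1))`. Like Verlet, the step uses one new force evaluation, which is consistent with the abstract's remark that the modification is minor.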

Mixing Time Guarantees for Unadjusted Hamiltonian Monte Carlo

May 03, 2021

Nawaf Bou-Rabee, Andreas Eberle

Abstract: We provide quantitative upper bounds on the total variation mixing time of the Markov chain corresponding to the unadjusted Hamiltonian Monte Carlo (uHMC) algorithm. For two general classes of models and fixed time discretization step size $h$, the mixing time is shown to depend only logarithmically on the dimension. Moreover, we provide quantitative upper bounds on the total variation distance between the invariant measure of the uHMC chain and the true target measure. As a consequence, we show that an $\varepsilon$-accurate approximation of the target distribution $\mu$ in total variation distance can be achieved by uHMC for a broad class of models with $O\left(d^{3/4}\varepsilon^{-1/2}\log (d/\varepsilon )\right)$ gradient evaluations, and for mean field models with weak interactions with $O\left(d^{1/2}\varepsilon^{-1/2}\log (d/\varepsilon )\right)$ gradient evaluations. The proofs are based on the construction of successful couplings for uHMC that realize the upper bounds.

* 43 pages
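
As a point of reference for the uHMC chain analyzed above, a minimal sketch follows. It assumes full momentum refreshment at every transition, a fixed number of Verlet steps, and no accept/reject correction, so the chain's invariant measure differs from the target by a discretization bias that depends on the step size $h$. The function names and parameters are illustrative.

```python
import random


def verlet(theta, rho, grad_U, h, n_steps):
    """n_steps of velocity Verlet with step size h."""
    g = grad_U(theta)
    for _ in range(n_steps):
        rho = [r - 0.5 * h * gi for r, gi in zip(rho, g)]
        theta = [t + h * r for t, r in zip(theta, rho)]
        g = grad_U(theta)
        rho = [r - 0.5 * h * gi for r, gi in zip(rho, g)]
    return theta, rho


def uhmc_step(theta, grad_U, h, n_steps, rng):
    """One unadjusted HMC transition: refresh momentum, integrate, always accept."""
    rho = [rng.gauss(0.0, 1.0) for _ in theta]
    theta_new, _ = verlet(theta, rho, grad_U, h, n_steps)
    return theta_new


def uhmc_chain(theta0, grad_U, h, n_steps, n_iter, seed=0):
    """Run n_iter transitions and return the visited positions."""
    rng = random.Random(seed)
    theta, path = list(theta0), []
    for _ in range(n_iter):
        theta = uhmc_step(theta, grad_U, h, n_steps, rng)
        path.append(theta)
    return path
```

Each transition costs `n_steps` new gradient evaluations plus one at the starting point, which is the quantity counted in the complexity bounds quoted in the abstract.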

Couplings for Andersen Dynamics

Sep 29, 2020

Nawaf Bou-Rabee, Andreas Eberle

Abstract: Andersen dynamics is a standard method for molecular simulations, and a precursor of the Hamiltonian Monte Carlo algorithm used in MCMC inference. The stochastic process corresponding to Andersen dynamics is a PDMP (piecewise deterministic Markov process) that iterates between Hamiltonian flows and velocity randomizations of randomly selected particles. From the viewpoint of both molecular dynamics and MCMC inference, a basic question is to understand the convergence to equilibrium of this PDMP, particularly in high dimension. Here we present couplings to obtain sharp convergence bounds in the Wasserstein sense that do not require global convexity of the underlying potential energy.

* 36 pages, 8 figures

Coupling and Convergence for Hamiltonian Monte Carlo

May 01, 2018

Nawaf Bou-Rabee, Andreas Eberle, Raphael Zimmer

Abstract: Based on a new coupling approach, we prove that the transition step of the Hamiltonian Monte Carlo algorithm is contractive w.r.t. a carefully designed Kantorovich ($L^1$ Wasserstein) distance. The lower bound for the contraction rate is explicit. Global convexity of the potential is not required, and thus multimodal target distributions are included. Explicit quantitative bounds for the number of steps required to approximate the stationary distribution up to a given error are a direct consequence of contractivity. These bounds show that HMC can overcome diffusive behaviour if the duration of the Hamiltonian dynamics is adjusted appropriately.

* 43 pages, 3 figures
