Abstract: We present a novel and flexible framework for localized tuning of Hamiltonian Monte Carlo samplers, in which the algorithm's tuning parameters are sampled conditionally on the position and momentum at each step. For adaptively sampling path lengths, we show that randomized Hamiltonian Monte Carlo, the No-U-Turn Sampler, and the Apogee-to-Apogee Path Sampler all fit within this unified framework as special cases. The framework is illustrated with a simple alternative to the No-U-Turn Sampler for locally adapting path lengths.
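As a rough illustration of the simplest special case named above, randomized Hamiltonian Monte Carlo, the following sketch draws the number of leapfrog steps at each iteration. It assumes a one-dimensional target, unit mass, and a path length drawn uniformly without looking at the position or momentum. Under that assumption the conditional distribution of the tuning parameter is the same at both ends of the trajectory, so the usual Metropolis test is taken to suffice; state-dependent rules such as the No-U-Turn Sampler would need a correction that is not shown here. The names `leapfrog`, `randomized_hmc_step`, and `run_chain` are hypothetical and do not come from the paper.

```python
import random


def leapfrog(theta, rho, grad_U, h, n_steps):
    """Take n_steps >= 1 leapfrog steps of size h for H = U(theta) + rho**2 / 2."""
    rho -= 0.5 * h * grad_U(theta)          # initial half kick
    for i in range(n_steps):
        theta += h * rho                    # full drift
        if i < n_steps - 1:
            rho -= h * grad_U(theta)        # interior full kick
    rho -= 0.5 * h * grad_U(theta)          # final half kick
    return theta, rho


def randomized_hmc_step(theta, U, grad_U, h, max_steps, rng):
    """One transition with a freshly sampled path length (assumed state-independent)."""
    rho = rng.gauss(0.0, 1.0)               # full momentum refreshment
    n_steps = rng.randint(1, max_steps)     # tuning parameter, uniform on {1, ..., max_steps}
    theta_new, rho_new = leapfrog(theta, rho, grad_U, h, n_steps)
    dH = (U(theta_new) + 0.5 * rho_new ** 2) - (U(theta) + 0.5 * rho ** 2)
    # Accept with probability min(1, exp(-dH)); an Exp(1) draw exceeds dH with that probability.
    return theta_new if rng.expovariate(1.0) > dH else theta


def run_chain(theta0, U, grad_U, h, max_steps, n_samples, rng):
    """Iterate the transition and collect the visited positions."""
    theta, samples = theta0, []
    for _ in range(n_samples):
        theta = randomized_hmc_step(theta, U, grad_U, h, max_steps, rng)
        samples.append(theta)
    return samples
```

For a standard normal target one would pass `U = lambda x: 0.5 * x * x` and `grad_U = lambda x: x`. Making `n_steps` depend on `(theta, rho)` is where the framework's conditional sampling of tuning parameters would enter.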
Abstract: A novel randomized time integrator is suggested for unadjusted Hamiltonian Monte Carlo (uHMC) in place of the usual Verlet integrator; namely, a stratified Monte Carlo (sMC) integrator which involves a minor modification to Verlet and hence is easy to implement. For target distributions of the form $\mu(dx) \propto e^{-U(x)} dx$ where $U: \mathbb{R}^d \to \mathbb{R}_{\ge 0}$ is both $K$-strongly convex and $L$-gradient Lipschitz, and initial distributions $\nu$ with finite second moment, coupling proofs reveal that an $\varepsilon$-accurate approximation of the target distribution $\mu$ in $L^2$-Wasserstein distance $\boldsymbol{\mathcal{W}}^2$ can be achieved by the uHMC algorithm with sMC time integration using $O\left((d/K)^{1/3} (L/K)^{5/3} \varepsilon^{-2/3} \log( \boldsymbol{\mathcal{W}}^2(\mu, \nu) / \varepsilon)^+\right)$ gradient evaluations, whereas without additional assumptions the corresponding complexity of the uHMC algorithm with Verlet time integration is in general $O\left((d/K)^{1/2} (L/K)^2 \varepsilon^{-1} \log( \boldsymbol{\mathcal{W}}^2(\mu, \nu) / \varepsilon)^+ \right)$. Duration randomization, which has an effect similar to that of partial momentum refreshment, is also treated. In this case, without additional assumptions on the target distribution, the complexity of duration-randomized uHMC with sMC time integration improves to $O\left(\max\left((d/K)^{1/4} (L/K)^{3/2} \varepsilon^{-1/2},(d/K)^{1/3} (L/K)^{4/3} \varepsilon^{-2/3} \right) \right)$ up to logarithmic factors. The improvement due to duration randomization turns out to be analogous to that of time integrator randomization.
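The abstract does not spell out the sMC update, so the following is only a sketch under a stated assumption: within each step of size $h$, the force is evaluated once at a uniformly random time along the free-flight path $q + u v$, $u \sim \mathrm{Uniform}(0, h)$, and then held constant over the step. A standard velocity Verlet step is included for comparison. The example is one-dimensional with unit mass, and the names `verlet_step`, `smc_step`, and `uhmc` are hypothetical.

```python
import random


def verlet_step(q, v, grad_U, h):
    """Standard velocity Verlet step: half kick, drift, half kick."""
    v_half = v - 0.5 * h * grad_U(q)
    q_new = q + h * v_half
    v_new = v_half - 0.5 * h * grad_U(q_new)
    return q_new, v_new


def smc_step(q, v, grad_U, h, rng):
    """Assumed sMC-style step: one force evaluation at a random point in the step."""
    u = rng.uniform(0.0, h)                 # random time within the current step
    force = -grad_U(q + u * v)              # force along the free-flight path
    q_new = q + h * v + 0.5 * h * h * force # constant-force position update
    v_new = v + h * force                   # constant-force velocity update
    return q_new, v_new


def uhmc(q0, grad_U, h, n_steps, n_samples, rng, randomized=True):
    """Unadjusted HMC: full momentum refresh, n_steps integrator steps, no accept/reject."""
    q, samples = q0, []
    for _ in range(n_samples):
        v = rng.gauss(0.0, 1.0)
        for _ in range(n_steps):
            if randomized:
                q, v = smc_step(q, v, grad_U, h, rng)
            else:
                q, v = verlet_step(q, v, grad_U, h)
        samples.append(q)
    return samples
```

Both variants skip the Metropolis correction, so their samples carry a discretization bias that shrinks with `h`. In this sketch the randomized step costs one gradient evaluation, the same as a Verlet step that reuses the previous step's final gradient.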