Austin J. Stromme

Minimum intrinsic dimension scaling for entropic optimal transport

Jun 06, 2023

Austin J. Stromme

Abstract: Motivated by the manifold hypothesis, which states that data with a high extrinsic dimension may yet have a low intrinsic dimension, we develop refined statistical bounds for entropic optimal transport that are sensitive to the intrinsic dimension of the data. Our bounds involve a robust notion of intrinsic dimension, measured at only a single distance scale depending on the regularization parameter, and show that it is only the minimum of these single-scale intrinsic dimensions which governs the rate of convergence. We call this the Minimum Intrinsic Dimension scaling (MID scaling) phenomenon, and establish MID scaling with no assumptions on the data distributions so long as the cost is bounded and Lipschitz, and for various entropic optimal transport quantities beyond just values, with stronger analogs when one distribution is supported on a manifold. Our results significantly advance the theoretical state of the art by showing that MID scaling is a generic phenomenon, and provide the first rigorous interpretation of the statistical effect of entropic regularization as a distance scale.

* 53 pages
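
For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the empirical entropic optimal transport value between two samples, computed with standard Sinkhorn iterations. It is not taken from the paper: the function name `sinkhorn_value`, the squared Euclidean cost, the uniform sample weights, and the fixed iteration count are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_value(x, y, eps, n_iters=500):
    """Entropic OT between uniform empirical measures on x (n, d) and y (m, d).

    Returns the coupling P and the value <P, C> + eps * KL(P | a x b).
    Hypothetical helper for illustration; assumes eps is not too small
    relative to the cost, so that exp(-C / eps) does not underflow.
    """
    n, m = len(x), len(y)
    a = np.full(n, 1.0 / n)  # uniform weights on the first sample
    b = np.full(m, 1.0 / m)  # uniform weights on the second sample
    # Squared Euclidean cost (bounded and Lipschitz on bounded samples).
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-C / eps)     # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)      # rescale to match row marginals
        v = b / (K.T @ u)    # rescale to match column marginals
    P = u[:, None] * K * v[None, :]
    kl = (P * np.log(P / (a[:, None] * b[None, :]))).sum()
    return P, float((P * C).sum() + eps * kl)

rng = np.random.default_rng(0)
x = rng.random((20, 2))
y = rng.random((30, 2))
P, value = sinkhorn_value(x, y, eps=0.5)
```

The regularization parameter `eps` is the quantity the abstract interprets as a distance scale; the statistical question is how fast this empirical value approaches its population counterpart as the sample sizes grow.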

Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent

Jun 16, 2021

Jason M. Altschuler, Sinho Chewi, Patrik Gerber, Austin J. Stromme

Abstract: We study first-order optimization algorithms for computing the barycenter of Gaussian distributions with respect to the optimal transport metric. Although the objective is geodesically non-convex, Riemannian GD empirically converges rapidly, in fact faster than off-the-shelf methods such as Euclidean GD and SDP solvers. This stands in stark contrast to the best-known theoretical results for Riemannian GD, which depend exponentially on the dimension. In this work, we prove new geodesic convexity results which provide stronger control of the iterates, yielding a dimension-free convergence rate. Our techniques also enable the analysis of two related notions of averaging, the entropically-regularized barycenter and the geometric median, providing the first convergence guarantees for Riemannian GD for these problems.

* 48 pages, 8 figures
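
As a rough illustration of the kind of algorithm the abstract refers to, here is a minimal sketch of Riemannian gradient descent for the barycenter of centered Gaussians, written in terms of their covariance matrices. It uses the commonly stated update Σ ← G Σ G with G = (1 − η)I + η Σᵢ wᵢ Tᵢ, where Tᵢ is the optimal transport map from the current iterate to the i-th Gaussian. The function names, the identity initialization, the default step size, and the iteration count are assumptions for illustration, not the authors' code.

```python
import numpy as np

def sqrtm_psd(S):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bw_barycenter_gd(covs, weights=None, step=1.0, n_iters=100):
    """Riemannian GD sketch for the Bures-Wasserstein barycenter.

    covs: array of shape (k, d, d) of positive definite covariances.
    Hypothetical helper; initialized at the identity (an assumption).
    """
    covs = np.asarray(covs, dtype=float)
    k, d, _ = covs.shape
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights, dtype=float)
    S = np.eye(d)
    for _ in range(n_iters):
        R = sqrtm_psd(S)
        R_inv = np.linalg.inv(R)
        # Weighted average of the transport maps
        # T_i = S^{-1/2} (S^{1/2} C_i S^{1/2})^{1/2} S^{-1/2}.
        M = sum(wi * sqrtm_psd(R @ Ci @ R) for wi, Ci in zip(w, covs))
        T_bar = R_inv @ M @ R_inv
        G = (1.0 - step) * np.eye(d) + step * T_bar
        S = G @ S @ G
        S = (S + S.T) / 2.0  # remove floating-point asymmetry
    return S

# Commuting example: for diagonal covariances the barycenter is the
# squared average of the square roots, here diag(4, 9).
covs = np.array([np.diag([1.0, 4.0]), np.diag([9.0, 16.0])])
bary = bw_barycenter_gd(covs)
```

With `step=1.0` the update reduces to the familiar fixed-point iteration for Gaussian barycenters; smaller steps interpolate between the current iterate and that fixed-point map.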

Exponential ergodicity of mirror-Langevin diffusions

Jun 02, 2020

Sinho Chewi, Thibaut Le Gouic, Chen Lu, Tyler Maunu, Philippe Rigollet, Austin J. Stromme

Abstract: Motivated by the problem of sampling from ill-conditioned log-concave distributions, we give a clean non-asymptotic convergence analysis of mirror-Langevin diffusions as introduced in Zhang et al. (2020). As a special case of this framework, we propose a class of diffusions called Newton-Langevin diffusions and prove that they converge to stationarity exponentially fast with a rate which not only is dimension-free, but also has no dependence on the target distribution. We give an application of this result to the problem of sampling from the uniform distribution on a convex body using a strategy inspired by interior-point methods. Our general approach follows the recent trend of linking sampling and optimization and highlights the role of the chi-squared divergence. In particular, it yields new results on the convergence of the vanilla Langevin diffusion in Wasserstein distance.

* 27 pages, 10 figures
