Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Peter Glynn

Deep Learning for Computing Convergence Rates of Markov Chains

May 30, 2024

Yanlin Qu, Jose Blanchet, Peter Glynn

Abstract:Convergence rate analysis for general state-space Markov chains is fundamentally important in areas such as Markov chain Monte Carlo and algorithmic analysis (for computing explicit convergence bounds). This problem, however, is notoriously difficult because traditional analytical methods often do not generate practically useful convergence bounds for realistic Markov chains. We propose the Deep Contractive Drift Calculator (DCDC), the first general-purpose sample-based algorithm for bounding the convergence of Markov chains to stationarity in Wasserstein distance. The DCDC has two components. First, inspired by the new convergence analysis framework in (Qu et.al, 2023), we introduce the Contractive Drift Equation (CDE), the solution of which leads to an explicit convergence bound. Second, we develop an efficient neural-network-based CDE solver. Equipped with these two components, DCDC solves the CDE and converts the solution into a convergence bound. We analyze the sample complexity of the algorithm and further demonstrate the effectiveness of the DCDC by generating convergence bounds for realistic Markov chains arising from stochastic processing networks as well as constant step-size stochastic optimization.

Via

Access Paper or Ask Questions

Optimal Sample Complexity for Average Reward Markov Decision Processes

Oct 13, 2023

Shengbo Wang, Jose Blanchet, Peter Glynn

Abstract:We settle the sample complexity of policy learning for the maximization of the long run average reward associated with a uniformly ergodic Markov decision process (MDP), assuming a generative model. In this context, the existing literature provides a sample complexity upper bound of $\widetilde O(|S||A|t_{\text{mix}}^2 \epsilon^{-2})$ and a lower bound of $\Omega(|S||A|t_{\text{mix}} \epsilon^{-2})$. In these expressions, $|S|$ and $|A|$ denote the cardinalities of the state and action spaces respectively, $t_{\text{mix}}$ serves as a uniform upper limit for the total variation mixing times, and $\epsilon$ signifies the error tolerance. Therefore, a notable gap of $t_{\text{mix}}$ still remains to be bridged. Our primary contribution is to establish an estimator for the optimal policy of average reward MDPs with a sample complexity of $\widetilde O(|S||A|t_{\text{mix}}\epsilon^{-2})$, effectively reaching the lower bound in the literature. This is achieved by combining algorithmic ideas in Jin and Sidford (2021) with those of Li et al. (2020).

Via

Access Paper or Ask Questions

Optimal Sample Complexity of Reinforcement Learning for Uniformly Ergodic Discounted Markov Decision Processes

Feb 16, 2023

Shengbo Wang, Jose Blanchet, Peter Glynn

Figure 1 for Optimal Sample Complexity of Reinforcement Learning for Uniformly Ergodic Discounted Markov Decision Processes

Abstract:We consider the optimal sample complexity theory of tabular reinforcement learning (RL) for controlling the infinite horizon discounted reward in a Markov decision process (MDP). Optimal min-max complexity results have been developed for tabular RL in this setting, leading to a sample complexity dependence on $\gamma$ and $\epsilon$ of the form $\tilde \Theta((1-\gamma)^{-3}\epsilon^{-2})$, where $\gamma$ is the discount factor and $\epsilon$ is the tolerance solution error. However, in many applications of interest, the optimal policy (or all policies) will induce mixing. We show that in these settings the optimal min-max complexity is $\tilde \Theta(t_{\text{minorize}}(1-\gamma)^{-2}\epsilon^{-2})$, where $t_{\text{minorize}}$ is a measure of mixing that is within an equivalent factor of the total variation mixing time. Our analysis is based on regeneration-type ideas, that, we believe are of independent interest since they can be used to study related problems for general state space MDPs.

Via

Access Paper or Ask Questions

The Design and Implementation of a Broadly Applicable Algorithm for Optimizing Intra-Day Surgical Scheduling

Mar 14, 2022

Jin Xie, Teng Zhang, Jose Blanchet, Peter Glynn, Matthew Randolph, David Scheinker

Figure 1 for The Design and Implementation of a Broadly Applicable Algorithm for Optimizing Intra-Day Surgical Scheduling

Figure 2 for The Design and Implementation of a Broadly Applicable Algorithm for Optimizing Intra-Day Surgical Scheduling

Figure 3 for The Design and Implementation of a Broadly Applicable Algorithm for Optimizing Intra-Day Surgical Scheduling

Figure 4 for The Design and Implementation of a Broadly Applicable Algorithm for Optimizing Intra-Day Surgical Scheduling

Abstract:Surgical scheduling optimization is an active area of research. However, few algorithms to optimize surgical scheduling are implemented and see sustained use. An algorithm is more likely to be implemented, if it allows for surgeon autonomy, i.e., requires only limited scheduling centralization, and functions in the limited technical infrastructure of widely used electronic medical records (EMRs). In order for an algorithm to see sustained use, it must be compatible with changes to hospital capacity, patient volumes, and scheduling practices. To meet these objectives, we developed the BEDS (better elective day of surgery) algorithm, a greedy heuristic for smoothing unit-specific surgical admissions across days. We implemented BEDS in the EMR of a large pediatric academic medical center. The use of BEDS was associated with a reduction in the variability in the number of admissions. BEDS is freely available as a dashboard in Tableau, a commercial software used by numerous hospitals. BEDS is readily implementable with the limited tools available to most hospitals, does not require reductions to surgeon autonomy or centralized scheduling, and is compatible with changes to hospital capacity or patient volumes. We present a general algorithmic framework from which BEDS is derived based on a particular choice of objectives and constraints. We argue that algorithms generated by this framework retain many of the desirable characteristics of BEDS while being compatible with a wide range of objectives and constraints.

Via

Access Paper or Ask Questions

Surgical Scheduling via Optimization and Machine Learning with Long-Tailed Data

Feb 13, 2022

Yuan Shi, Saied Mahdian, Jose Blanchet, Peter Glynn, Andrew Y. Shin, David Scheinker

Figure 1 for Surgical Scheduling via Optimization and Machine Learning with Long-Tailed Data

Figure 2 for Surgical Scheduling via Optimization and Machine Learning with Long-Tailed Data

Figure 3 for Surgical Scheduling via Optimization and Machine Learning with Long-Tailed Data

Figure 4 for Surgical Scheduling via Optimization and Machine Learning with Long-Tailed Data

Abstract:Using data from cardiovascular surgery patients with long and highly variable post-surgical lengths of stay (LOS), we develop a model to reduce recovery unit congestion. We estimate LOS using a variety of machine learning models, schedule procedures with a variety of online optimization models, and estimate performance with simulation. The machine learning models achieved only modest LOS prediction accuracy, despite access to a very rich set of patient characteristics. Compared to the current paper-based system used in the hospital, most optimization models failed to reduce congestion without increasing wait times for surgery. A conservative stochastic optimization with sufficient sampling to capture the long tail of the LOS distribution outperformed the current manual process. These results highlight the perils of using oversimplified distributional models of patient length of stay for scheduling procedures and the importance of using stochastic optimization well-suited to dealing with long-tailed behavior.

Via

Access Paper or Ask Questions

Optimal best arm selection for general distributions

Aug 24, 2019

Shubhada Agrawal, Sandeep Juneja, Peter Glynn

Abstract:Given a finite set of unknown distributions $\textit{or arms}$ that can be sampled from, we consider the problem of identifying the one with the largest mean using a delta-correct algorithm (an adaptive, sequential algorithm that restricts the probability of error to a specified delta) that has minimum sample complexity. Lower bounds for delta-correct algorithms are well known. Further, delta-correct algorithms that match the lower bound asymptotically as delta reduces to zero have also been developed in literature when the arm distributions are restricted to a single parameter exponential family. In this paper, we first observe a negative result that some restrictions are essential as otherwise under a delta-correct algorithm, distributions with unbounded support would require an infinite number of samples in expectation. We then propose a delta-correct algorithm that matches the lower bound as delta reduces to zero under a mild restriction that a known bound on the expectation of a non-negative, increasing convex function (for example, the squared moment) of underlying random variables, exists. We also propose batch processing and identify optimal batch sizes to substantially speed up the proposed algorithm. This best arm selection problem is a well studied classic problem in the simulation community. It has many learning applications including in recommendation systems and in product selection.

* 34 pages

Via

Access Paper or Ask Questions

Optimal Transport Relaxations with Application to Wasserstein GANs

Jun 07, 2019

Saied Mahdian, Jose Blanchet, Peter Glynn

Figure 1 for Optimal Transport Relaxations with Application to Wasserstein GANs

Figure 2 for Optimal Transport Relaxations with Application to Wasserstein GANs

Figure 3 for Optimal Transport Relaxations with Application to Wasserstein GANs

Figure 4 for Optimal Transport Relaxations with Application to Wasserstein GANs

Abstract:We propose a family of relaxations of the optimal transport problem which regularize the problem by introducing an additional minimization step over a small region around one of the underlying transporting measures. The type of regularization that we obtain is related to smoothing techniques studied in the optimization literature. When using our approach to estimate optimal transport costs based on empirical measures, we obtain statistical learning bounds which are useful to guide the amount of regularization, while maintaining good generalization properties. To illustrate the computational advantages of our regularization approach, we apply our method to training Wasserstein GANs. We obtain running time improvements, relative to current benchmarks, with no deterioration in testing performance (via FID). The running time improvement occurs because our new optimality-based threshold criterion reduces the number of expensive iterates of the generating networks, while increasing the number of actor-critic iterations.

Via

Access Paper or Ask Questions

Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning

Jan 30, 2019

Casey Chu, Jose Blanchet, Peter Glynn

Figure 1 for Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning

Figure 2 for Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning

Abstract:The goal of this paper is to provide a unifying view of a wide range of problems of interest in machine learning by framing them as the minimization of functionals defined on the space of probability measures. In particular, we show that generative adversarial networks, variational inference, and actor-critic methods in reinforcement learning can all be seen through the lens of our framework. We then discuss a generic optimization algorithm for our formulation, called probability functional descent (PFD), and show how this algorithm recovers existing methods developed independently in the settings mentioned earlier.

Via

Access Paper or Ask Questions

Selecting the best system and multi-armed bandits

Sep 10, 2018

Peter Glynn, Sandeep Juneja

Figure 1 for Selecting the best system and multi-armed bandits

Figure 2 for Selecting the best system and multi-armed bandits

Figure 3 for Selecting the best system and multi-armed bandits

Abstract:Consider the problem of finding a population or a probability distribution amongst many with the largest mean when these means are unknown but population samples can be simulated or otherwise generated. Typically, by selecting largest sample mean population, it can be shown that false selection probability decays at an exponential rate. Lately, researchers have sought algorithms that guarantee that this probability is restricted to a small $\delta$ in order $\log(1/\delta)$ computational time by estimating the associated large deviations rate function via simulation. We show that such guarantees are misleading when populations have unbounded support even when these may be light-tailed. Specifically, we show that any policy that identifies the correct population with probability at least $1-\delta$ for each problem instance requires infinite number of samples in expectation in making such a determination in any problem instance. This suggests that some restrictions are essential on populations to devise $O(\log(1/\delta))$ algorithms with $1 - \delta$ correctness guarantees. We note that under restriction on population moments, such methods are easily designed, and that sequential methods from stochastic multi-armed bandit literature can be adapted to devise such algorithms.

Via

Access Paper or Ask Questions

An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public Streets

Sep 09, 2018

Mansur Arief, Peter Glynn, Ding Zhao

Figure 1 for An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public Streets

Figure 2 for An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public Streets

Figure 3 for An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public Streets

Figure 4 for An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public Streets

Abstract:Various automobile and mobility companies, for instance Ford, Uber and Waymo, are currently testing their pre-produced autonomous vehicle (AV) fleets on the public roads. However, due to rareness of the safety-critical cases and, effectively, unlimited number of possible traffic scenarios, these on-road testing efforts have been acknowledged as tedious, costly, and risky. In this study, we propose Accelerated De- ployment framework to safely and efficiently estimate the AVs performance on public streets. We showed that by appropriately addressing the gradual accuracy improvement and adaptively selecting meaningful and safe environment under which the AV is deployed, the proposed framework yield to highly accurate estimation with much faster evaluation time, and more importantly, lower deployment risk. Our findings provide an answer to the currently heated and active discussions on how to properly test AV performance on public roads so as to achieve safe, efficient, and statistically-reliable testing framework for AV technologies.

Via

Access Paper or Ask Questions