Mengbing Li

Doubly Inhomogeneous Reinforcement Learning

Nov 12, 2022

Liyuan Hu, Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz

Abstract: This paper studies reinforcement learning (RL) in doubly inhomogeneous environments under temporal non-stationarity and subject heterogeneity. In a number of applications, it is commonplace to encounter datasets generated by system dynamics that may change over time and population, challenging high-quality sequential decision making. Nonetheless, most existing RL solutions require either temporal stationarity or subject homogeneity, which would result in sub-optimal policies if both assumptions were violated. To address both challenges simultaneously, we propose an original algorithm to determine the "best data chunks" that display similar dynamics over time and across individuals for policy learning, which alternates between most recent change point detection and cluster identification. Our method is general, and works with a wide range of clustering and change point detection algorithms. It is multiply robust in the sense that it takes multiple initial estimators as input and only requires one of them to be consistent. Moreover, by borrowing information over time and population, it allows us to detect weaker signals and has better convergence properties when compared to applying the clustering algorithm per time or the change point detection algorithm per subject. Empirically, we demonstrate the usefulness of our method through extensive simulations and a real data application.
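
The alternation described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes each subject contributes a scalar time series (one row of a hypothetical `data` array), swaps in a mean-shift CUSUM statistic for change point detection, and uses a simple largest-gap split into at most two clusters. All function names, the `threshold`, and the `min_gap` parameter are assumptions made for illustration.

```python
import numpy as np

def cusum_argmax(series):
    """Return (split index, statistic) maximising a mean-shift CUSUM statistic."""
    n = len(series)
    best_k, best_stat = 0, 0.0
    for k in range(1, n):
        weight = np.sqrt(k * (n - k) / n)
        stat = weight * abs(series[:k].mean() - series[k:].mean())
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

def most_recent_change_point(series, threshold):
    """Repeatedly split at the strongest change until the tail looks stable."""
    start = 0
    while len(series) - start >= 2:
        k, stat = cusum_argmax(series[start:])
        if stat <= threshold:
            break
        start += k
    return start  # 0 means no change point was detected

def split_at_largest_gap(values, min_gap):
    """Toy clustering: cut the sorted values at their largest gap (two clusters)."""
    order = np.argsort(values)
    gaps = np.diff(values[order])
    labels = np.zeros(len(values), dtype=int)
    if len(gaps) == 0 or gaps.max() < min_gap:
        return labels  # keep everyone in a single cluster
    cut = int(np.argmax(gaps)) + 1
    labels[order[cut:]] = 1
    return labels

def pooled_change_points(data, labels, threshold):
    """Detect the most recent change point on each cluster's averaged series."""
    return {
        int(c): most_recent_change_point(data[labels == c].mean(axis=0), threshold)
        for c in np.unique(labels)
    }

def alternate(data, threshold, min_gap, n_iter=10):
    """Alternate between change point detection and clustering."""
    n_subjects = data.shape[0]
    labels = np.zeros(n_subjects, dtype=int)  # start from a single cluster
    for _ in range(n_iter):
        change_points = pooled_change_points(data, labels, threshold)
        # Summarise each subject by its mean after its cluster's change point.
        post_means = np.array([
            data[i, change_points[int(labels[i])]:].mean()
            for i in range(n_subjects)
        ])
        new_labels = split_at_largest_gap(post_means, min_gap)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, pooled_change_points(data, labels, threshold)

# Four subjects share a change at t = 50 but shift in opposite directions.
up = np.r_[np.zeros(50), 5.0 * np.ones(50)]
data = np.vstack([up, up, -up, -up])
labels, change_points = alternate(data, threshold=10.0, min_gap=1.0)
```

In this toy data the opposite shifts cancel when all subjects are pooled, so the change is only found after the subjects have been split into clusters, which is one reason to alternate the two steps rather than run either one alone.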

Reinforcement Learning in Possibly Nonstationary Environments

Mar 03, 2022

Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz

Abstract: We consider reinforcement learning (RL) methods in offline nonstationary environments. Many existing RL algorithms in the literature rely on the stationarity assumption that requires the system transition and the reward function to be constant over time. However, the stationarity assumption is restrictive in practice and is likely to be violated in a number of applications, including traffic signal control, robotics and mobile health. In this paper, we develop a consistent procedure to test the nonstationarity of the optimal policy based on pre-collected historical data, without additional online data collection. Based on the proposed test, we further develop a sequential change point detection method that can be naturally coupled with existing state-of-the-art RL methods for policy optimisation in nonstationary environments. The usefulness of our method is illustrated by theoretical results, simulation studies, and a real data example from the 2018 Intern Health Study. A Python implementation of the proposed procedure is available at https://github.com/limengbinggz/CUSUM-RL
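
For readers unfamiliar with CUSUM-type change point detection, the following is a minimal generic sketch. It is not the procedure from the paper or the linked repository: the paper tests nonstationarity of the optimal policy, whereas this example assumes a plain scalar series (for instance, observed rewards) and looks for a single shift in its mean. The function names and the `threshold` value are hypothetical.

```python
import numpy as np

def cusum_statistics(x):
    """Weighted difference of means at every candidate split point k = 1..n-1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    stats = np.zeros(n - 1)
    for k in range(1, n):
        weight = np.sqrt(k * (n - k) / n)
        stats[k - 1] = weight * abs(x[:k].mean() - x[k:].mean())
    return stats

def detect_change_point(x, threshold):
    """Return the split with the largest statistic, or None if below threshold."""
    stats = cusum_statistics(x)
    k = int(np.argmax(stats)) + 1
    return k if stats[k - 1] > threshold else None

# A series whose mean jumps from 0 to 5 halfway through.
rewards = np.r_[np.zeros(50), 5.0 * np.ones(50)]
change_point = detect_change_point(rewards, threshold=10.0)
```

In an offline RL workflow of the kind the abstract describes, one might then fit a policy only on the data after the detected change point; how the threshold is calibrated is left open here.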
