Lukas Zierahn

A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs

May 15, 2023

Dirk van der Hoeven, Lukas Zierahn, Tal Lancewicki, Aviv Rosenberg, Nicoló Cesa-Bianchi

Abstract: We derive a new analysis of Follow The Regularized Leader (FTRL) for online learning with delayed bandit feedback. By separating the cost of delayed feedback from that of bandit feedback, our analysis allows us to obtain new results in three important settings. On the one hand, we derive the first optimal (up to logarithmic factors) regret bounds for combinatorial semi-bandits with delay and adversarial Markov decision processes with delay (and known transition functions). On the other hand, we use our analysis to derive an efficient algorithm for linear bandits with delay achieving near-optimal regret bounds. Our novel regret decomposition shows that FTRL remains stable across multiple rounds under mild assumptions on the Hessian of the regularizer.
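
To make the setting concrete, here is a minimal sketch of FTRL with delayed bandit feedback in the simplest case, a K-armed bandit with a negative-entropy regularizer (i.e. exponential weights with importance-weighted loss estimates). It is not the paper's algorithm for semi-bandits, linear bandits, or MDPs; the function name `delayed_exp3`, the fixed learning rate `eta`, and the convention that the loss of round t arrives at the end of round t + delays[t] are all assumptions made for illustration.

```python
import math
import random


def delayed_exp3(losses, delays, eta, seed=0):
    """Exponential-weights FTRL on a K-armed bandit with delayed feedback.

    losses[t][i] is the loss of arm i in round t (values in [0, 1]).
    The loss observed in round t only becomes available at the end of
    round t + delays[t]. Returns (total loss suffered, last distribution).
    """
    rng = random.Random(seed)
    num_rounds, num_arms = len(losses), len(losses[0])
    cum_est = [0.0] * num_arms   # sum of loss estimates received so far
    pending = {}                 # arrival round -> [(arm, prob, loss), ...]
    total_loss = 0.0
    probs = [1.0 / num_arms] * num_arms

    for t in range(num_rounds):
        # FTRL step with negative entropy: p_i proportional to
        # exp(-eta * cum_est[i]); subtract the minimum for stability.
        base = min(cum_est)
        weights = [math.exp(-eta * (c - base)) for c in cum_est]
        norm = sum(weights)
        probs = [w / norm for w in weights]

        arm = rng.choices(range(num_arms), weights=probs)[0]
        total_loss += losses[t][arm]

        # The observation is queued until its arrival round.
        arrival = t + delays[t]
        pending.setdefault(arrival, []).append((arm, probs[arm], losses[t][arm]))

        # Feedback arriving now is turned into importance-weighted
        # estimates, using the probability from the round it was played.
        for played, prob, loss in pending.pop(t, []):
            cum_est[played] += loss / prob

    return total_loss, probs
```

For example, `delayed_exp3([[0.0, 1.0]] * 2000, [5] * 2000, eta=0.05)` runs the sketch on a two-armed instance in which every observation is delayed by five rounds. Feedback that would arrive after the final round is simply never used.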

PyChEst: a Python package for the consistent retrospective estimation of distributional changes in piece-wise stationary time series

Dec 20, 2021

Azadeh Khaleghi, Lukas Zierahn

Abstract: We introduce PyChEst, a Python package that provides tools for the simultaneous estimation of multiple changepoints in the distribution of piece-wise stationary time series. The nonparametric algorithms implemented are provably consistent in a general framework: when the samples are generated by unknown piece-wise stationary processes. In this setting, samples may have long-range dependencies of arbitrary form, and the finite-dimensional marginals of any (unknown) fixed size before and after the changepoints may be the same. The strength of the algorithms included in the package lies in their ability to consistently detect the changes without imposing any assumptions beyond stationarity on the underlying process distributions. We illustrate this distinguishing feature by comparing the performance of the package against state-of-the-art models designed for a setting where the samples are independently and identically distributed.
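
The remark about identical finite-dimensional marginals can be illustrated with a small sketch. It does not use PyChEst or reproduce its algorithms; the helpers `make_series`, `pattern_freqs`, and `l1_gap` are hypothetical names introduced here. The series consists of fair i.i.d. bits followed by a "sticky" two-state Markov chain, so the single-sample marginal is uniform on both sides of the change while the distribution of consecutive pairs differs.

```python
import random


def make_series(n, seed=0):
    """n i.i.d. fair bits followed by n samples of a sticky Markov chain.

    Both halves have a uniform one-dimensional marginal; only the
    dependence between consecutive samples changes at index n.
    """
    rng = random.Random(seed)
    xs = [rng.choice((0, 1)) for _ in range(n)]
    state = rng.choice((0, 1))
    for _ in range(n):
        if rng.random() > 0.9:   # flip with probability 0.1
            state = 1 - state
        xs.append(state)
    return xs


def pattern_freqs(xs, k):
    """Empirical frequencies of all length-k windows in xs."""
    total = len(xs) - k + 1
    counts = {}
    for i in range(total):
        key = tuple(xs[i:i + k])
        counts[key] = counts.get(key, 0) + 1
    return {key: c / total for key, c in counts.items()}


def l1_gap(a, b, k):
    """L1 distance between the length-k pattern frequencies of a and b."""
    fa, fb = pattern_freqs(a, k), pattern_freqs(b, k)
    return sum(abs(fa.get(key, 0.0) - fb.get(key, 0.0))
               for key in set(fa) | set(fb))


series = make_series(5000)
left, right = series[:5000], series[5000:]
gap_single = l1_gap(left, right, 1)   # compares one-dimensional marginals
gap_pairs = l1_gap(left, right, 2)    # compares consecutive pairs
```

In this construction `gap_single` stays close to zero while `gap_pairs` is large, which is why a method that only compares single-sample statistics (as is natural under an i.i.d. assumption) can miss such a change.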
