Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Break your Bandit Routine with LSD Rewards: a Last Switch Dependent Analysis of Satiation and Seasonality

Oct 22, 2021

Pierre Laforgue, Giulia Clerici, Nicolò Cesa-Bianchi, Ran Gilad-Bachrach

Figure 1 for Break your Bandit Routine with LSD Rewards: a Last Switch Dependent Analysis of Satiation and Seasonality

Figure 2 for Break your Bandit Routine with LSD Rewards: a Last Switch Dependent Analysis of Satiation and Seasonality

Figure 3 for Break your Bandit Routine with LSD Rewards: a Last Switch Dependent Analysis of Satiation and Seasonality

Figure 4 for Break your Bandit Routine with LSD Rewards: a Last Switch Dependent Analysis of Satiation and Seasonality

Share this with someone who'll enjoy it:

Abstract:Motivated by the fact that humans like some level of unpredictability or novelty, and might therefore get quickly bored when interacting with a stationary policy, we introduce a novel non-stationary bandit problem, where the expected reward of an arm is fully determined by the time elapsed since the arm last took part in a switch of actions. Our model generalizes previous notions of delay-dependent rewards, and also relaxes most assumptions on the reward function. This enables the modeling of phenomena such as progressive satiation and periodic behaviours. Building upon the Combinatorial Semi-Bandits (CSB) framework, we design an algorithm and prove a bound on its regret with respect to the optimal non-stationary policy (which is NP-hard to compute). Similarly to previous works, our regret analysis is based on defining and solving an appropriate trade-off between approximation and estimation. Preliminary experiments confirm the superiority of our algorithm over both the oracle greedy approach and a vanilla CSB solver.

View paper on

Share this with someone who'll enjoy it:

Title:Break your Bandit Routine with LSD Rewards: a Last Switch Dependent Analysis of Satiation and Seasonality

Paper and Code