Title: From Restless to Contextual: A Thresholding Bandit Approach to Improve Finite-horizon Performance

Feb 07, 2025

Jiamin Xu, Ivan Nazarov, Aditya Rastogi, África Periáñez, Kyra Gan

Abstract: Online restless bandits extend classic contextual bandits by incorporating state transitions and budget constraints, representing each agent as a Markov Decision Process (MDP). This framework is crucial for finite-horizon strategic resource allocation, optimizing limited costly interventions for long-term benefits. However, learning the underlying MDP for each agent poses a major challenge in finite-horizon settings. To facilitate learning, we reformulate the problem as a scalable budgeted thresholding contextual bandit problem, carefully integrating the state transitions into the reward design and focusing on identifying agents with action benefits exceeding a threshold. We establish the optimality of an oracle greedy solution in a simple two-state setting, and propose an algorithm that achieves minimax optimal constant regret in the online multi-state setting with heterogeneous agents and knowledge of outcomes under no intervention. We numerically show that our algorithm outperforms existing online restless bandit methods, offering significant improvements in finite-horizon performance.
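To make the thresholding idea in the abstract concrete, here is a minimal sketch of a budgeted, threshold-based greedy selection step. It is not the paper's algorithm. The function name `greedy_threshold_select`, the numeric benefit estimates, and the reading of "action benefit" as a per-agent scalar are all assumptions made for illustration. The sketch only shows the general pattern of intervening on at most `budget` agents whose estimated benefit exceeds a threshold.

```python
from typing import List, Sequence


def greedy_threshold_select(
    benefits: Sequence[float], threshold: float, budget: int
) -> List[int]:
    """Pick up to `budget` agents whose estimated benefit exceeds `threshold`.

    `benefits[i]` is a hypothetical estimate of how much agent i gains from
    an intervention relative to no intervention (an assumption of this
    sketch, not a quantity defined in the abstract).
    """
    # Keep only agents whose estimated benefit clears the threshold.
    eligible = [i for i, b in enumerate(benefits) if b > threshold]
    # Prefer agents with the largest estimated benefit.
    eligible.sort(key=lambda i: benefits[i], reverse=True)
    # Respect the per-round intervention budget.
    return eligible[:budget]


# Illustrative values only: four agents, budget of two interventions.
chosen = greedy_threshold_select([0.05, 0.30, 0.12, 0.40], threshold=0.10, budget=2)
print(chosen)
```

In an online setting the benefit estimates would have to be learned from observed outcomes; this sketch treats them as given, in the spirit of an oracle.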
