Wang Chi Cheung

Leveraging (Biased) Information: Multi-armed Bandits with Offline Data

May 04, 2024
Via arXiv

Best Arm Identification with Resource Constraints

Feb 29, 2024
Via arXiv

Non-Stationary Bandits with Knapsack Problems with Advice

Feb 08, 2023
Via arXiv

On the Pareto Frontier of Regret Minimization and Best Arm Identification in Stochastic Bandits

Oct 16, 2021
Via arXiv

Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions

Oct 16, 2020
Via arXiv

Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism

Jun 24, 2020
Via arXiv

Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting

Jan 24, 2020
Via arXiv

Reinforcement Learning under Drift

Jun 07, 2019
Via arXiv

Exploration-Exploitation Trade-off in Reinforcement Learning on Online Markov Decision Processes with Global Concave Rewards

May 15, 2019
Via arXiv

Hedging the Drift: Learning to Optimize under Non-Stationarity

Mar 04, 2019
Via arXiv