Wang Chi Cheung

Leveraging (Biased) Information: Multi-armed Bandits with Offline Data
May 04, 2024

Best Arm Identification with Resource Constraints
Feb 29, 2024

Non-Stationary Bandits with Knapsack Problems with Advice
Feb 08, 2023

On the Pareto Frontier of Regret Minimization and Best Arm Identification in Stochastic Bandits
Oct 16, 2021

Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions
Oct 16, 2020

Reinforcement Learning for Non-Stationary Markov Decision Processes: The Blessing of (More) Optimism
Jun 24, 2020

Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting
Jan 24, 2020

Reinforcement Learning under Drift
Jun 07, 2019

Exploration-Exploitation Trade-off in Reinforcement Learning on Online Markov Decision Processes with Global Concave Rewards
May 15, 2019

Hedging the Drift: Learning to Optimize under Non-Stationarity
Mar 04, 2019