
Pierre Menard

IMT

Model-free Posterior Sampling via Learning Rate Randomization

Oct 27, 2023

Demonstration-Regularized RL

Oct 26, 2023

Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms

Apr 06, 2023

Fast Rates for Maximum Entropy Exploration

Mar 14, 2023

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees

Sep 28, 2022

From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses

May 16, 2022

UCB Momentum Q-learning: Correcting the bias without forgetting

Mar 01, 2021

The Influence of Shape Constraints on the Thresholding Bandit Problem

Jun 17, 2020

Thresholding Bandit for Dose-ranging: The Impact of Monotonicity

Jul 24, 2018

KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints

May 14, 2018