Bruno Gaujal

POLARIS, LIG

Logarithmic Regret of Exploration in Average Reward Markov Decision Processes

Feb 10, 2025
Via arXiv

Learning Optimal Admission Control in Partially Observable Queueing Networks

Aug 04, 2023
Via arXiv

Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space

Feb 21, 2023
Via arXiv

Decentralized model-free reinforcement learning in stochastic games with average-reward objective

Jan 13, 2023
Via arXiv

Reinforcement Learning for Markovian Bandits: Is Posterior Sampling more Scalable than Optimism?

Jun 16, 2021
Via arXiv

Penalty-regulated dynamics and robust learning procedures in games

Apr 06, 2014
Via arXiv

Mean field for Markov Decision Processes: from Discrete to Continuous Optimization

May 19, 2011
Via arXiv