Denis Denisov

Regret Analysis of a Markov Policy Gradient Algorithm for Multi-arm Bandits

Aug 05, 2020
Via arXiv