Title: Online Regret Bounds for Undiscounted Continuous Reinforcement Learning

Feb 11, 2013

Ronald Ortner, Daniil Ryabko

Abstract: We derive sublinear regret bounds for undiscounted reinforcement learning in continuous state space. The proposed algorithm combines state aggregation with upper confidence bounds to implement optimism in the face of uncertainty. Besides the existence of an optimal policy that satisfies the Poisson equation, the only assumptions made are Hölder continuity of rewards and transition probabilities.

* In Proceedings of NIPS 2012, pp. 1772-1780
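
The two ingredients named in the abstract, state aggregation and upper confidence bounds, can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes a one-dimensional state space [0, 1] split into equal-width intervals and rewards in [0, 1]. The helper names (`interval_index`, `optimistic_reward`), the Hölder parameters `holder_l` and `holder_alpha`, and the constants in the confidence width are hypothetical choices made for illustration only.

```python
import math


def interval_index(state: float, n_intervals: int) -> int:
    """Map a state in [0, 1] to the index of its aggregation interval.

    Assumption: the state space is [0, 1], cut into n_intervals
    equal-width cells (hypothetical setup for this sketch).
    """
    if not 0.0 <= state <= 1.0:
        raise ValueError("state must lie in [0, 1]")
    # state == 1.0 would give index n_intervals, so clamp to the last cell.
    return min(int(state * n_intervals), n_intervals - 1)


def optimistic_reward(
    reward_sum: float,
    visits: int,
    t: int,
    n_intervals: int,
    holder_l: float = 1.0,
    holder_alpha: float = 1.0,
    delta: float = 0.05,
) -> float:
    """Optimistic reward estimate for one (interval, action) pair.

    Adds two terms to the empirical mean:
      * an aggregation bias L * n^(-alpha): under Hoelder continuity,
        rewards of two states in the same cell of width 1/n differ by
        at most this much;
      * a Hoeffding-style confidence width that shrinks with visits.
    The constants are illustrative, not those of the paper.
    """
    n = max(1, visits)
    mean = reward_sum / n
    bias = holder_l * n_intervals ** (-holder_alpha)
    width = math.sqrt(math.log(2.0 * t / delta) / (2.0 * n))
    # Rewards are assumed to lie in [0, 1], so clip the optimistic value.
    return min(1.0, mean + bias + width)
```

An unvisited cell receives the maximal optimistic value, which pushes exploration toward it; as visits accumulate, the bonus shrinks toward the aggregation bias, which in turn only decreases if a finer partition is used.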
