Title: The Gambler's Problem and Beyond

Dec 31, 2019

Baoxiang Wang, Shuai Li, Jiajin Li, Siu On Chan

Abstract: We analyze the Gambler's problem, a simple reinforcement learning problem in which the gambler can double or lose each bet until a target is reached. It is an early example in the reinforcement learning textbook by Sutton and Barto (2018), who note an interesting pattern in the optimal value function, with high-frequency components and repeating non-smooth points, but do not investigate it further. We provide the exact formula for the optimal value function in both the discrete and the continuous cases. Simple as the problem might seem, the value function is pathological: it is fractal and self-similar, its derivative is either zero or infinity, it is not smooth on any interval, and it cannot be written in terms of elementary functions. It is in fact one of the generalized Cantor functions, with a complexity that has been uncharted thus far. Our analyses could lead to insights into improving value function approximation, gradient-based algorithms, and Q-learning in real applications and implementations.

* International Conference on Learning Representations (ICLR) 2020
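
To make the problem in the abstract concrete, here is a minimal value-iteration sketch for the discrete case. It is not the paper's exact formula, only a numerical approximation of the optimal value function. The specifics are assumptions borrowed from the usual textbook setup rather than from the abstract: a target of 100, a win probability of 0.4, no discounting, a reward of 1 only on reaching the target, and the hypothetical names `gambler_value_iteration`, `goal`, and `p_win`.

```python
def gambler_value_iteration(goal=100, p_win=0.4, tol=1e-10, max_sweeps=10_000):
    """Approximate the optimal value function of the discrete Gambler's problem.

    Assumed setup (not stated in the abstract): capital s in 0..goal, a bet a
    with 1 <= a <= min(s, goal - s), the bet is won with probability p_win,
    and the only reward is 1 for reaching the goal. No discounting.
    """
    # v[s] estimates the probability of reaching the goal from capital s.
    v = [0.0] * (goal + 1)
    v[goal] = 1.0  # terminal state: the target has been reached

    for _ in range(max_sweeps):
        delta = 0.0
        for s in range(1, goal):
            # Bellman optimality backup over all admissible bets.
            best = max(
                p_win * v[s + a] + (1.0 - p_win) * v[s - a]
                for a in range(1, min(s, goal - s) + 1)
            )
            delta = max(delta, abs(best - v[s]))
            v[s] = best  # in-place (Gauss-Seidel style) update
        if delta < tol:
            break
    return v


if __name__ == "__main__":
    values = gambler_value_iteration()
    for capital in (25, 50, 75):
        print(capital, round(values[capital], 6))
```

Under these assumptions, the estimate at a capital of 50 should approach the win probability, since staking everything reaches the target in a single bet. Plotting the returned list against capital shows the repeating non-smooth points that the abstract describes.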

View paper on OpenReview
