Title: Self-Correcting Models for Model-Based Reinforcement Learning

Jul 26, 2017

Erik Talvitie

Abstract: When an agent cannot represent a perfectly accurate model of its environment's dynamics, model-based reinforcement learning (MBRL) can fail catastrophically. Planning involves composing the predictions of the model; when flawed predictions are composed, even minor errors can compound and render the model useless for planning. Hallucinated Replay (Talvitie 2014) trains the model to "correct" itself when it produces errors, substantially improving MBRL with flawed models. This paper theoretically analyzes this approach, illuminates settings in which it is likely to be effective or ineffective, and presents a novel error bound, showing that a model's ability to self-correct is more tightly related to MBRL performance than one-step prediction error. These results inspire an MBRL algorithm for deterministic MDPs with performance guarantees that are robust to model class limitations.
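The compounding effect described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar deterministic dynamics, the coefficients 1.10 and 1.15, and the function names `true_step`, `model_step`, `rollout_errors`, and `hallucinated_pairs`. The last function shows one plausible reading of Hallucinated Replay, in which the model's own (possibly wrong) predicted state is paired with the environment's true next state as a training example; it is not the paper's algorithm.

```python
from typing import Callable, List, Tuple

State = float


def rollout(step: Callable[[State], State], s0: State, horizon: int) -> List[State]:
    """Compose `step` with itself and return [s_0, s_1, ..., s_horizon]."""
    states = [s0]
    for _ in range(horizon):
        states.append(step(states[-1]))
    return states


def true_step(s: State) -> State:
    """Hypothetical deterministic environment dynamics (illustrative only)."""
    return 1.10 * s


def model_step(s: State) -> State:
    """Hypothetical flawed model: slightly wrong on every single step."""
    return 1.15 * s


def rollout_errors(s0: State, horizon: int) -> List[float]:
    """Absolute gap between real and imagined states at each rollout depth."""
    real = rollout(true_step, s0, horizon)
    imagined = rollout(model_step, s0, horizon)
    return [abs(r - m) for r, m in zip(real, imagined)]


def hallucinated_pairs(s0: State, horizon: int) -> List[Tuple[State, State]]:
    """Training pairs (model's state at depth t, true state at depth t + 1).

    Assumed reading of Hallucinated Replay: the input is what the model
    would see during its own rollout, and the target is the real outcome,
    so a learner fit to these pairs is pushed back toward the true trajectory.
    """
    real = rollout(true_step, s0, horizon)
    imagined = rollout(model_step, s0, horizon)
    return [(imagined[t], real[t + 1]) for t in range(horizon)]


if __name__ == "__main__":
    for depth, err in enumerate(rollout_errors(1.0, 10)):
        print(f"depth {depth:2d}: |real - imagined| = {err:.4f}")
```

In this toy setting the one-step error is small, yet the gap grows with every composed prediction, which is the failure mode the abstract attributes to planning with a flawed model. The pairs returned by `hallucinated_pairs` differ from ordinary one-step training data only in their inputs, which come from the model's rollout instead of the real trajectory.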

* Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2597-2603 (2017)
* Original paper appeared in Proceedings of the 31st AAAI Conference on Artificial Intelligence, 2017. This version incorporates the appendix into the document (rather than as supplementary material), corrects a minor error in Lemma 1, and fixes some typos.
