Title: Model-Based Reinforcement Learning via Meta-Policy Optimization

Sep 14, 2018

Ignasi Clavera, Jonas Rothfuss, John Schulman, Yasuhiro Fujita, Tamim Asfour, Pieter Abbeel

Abstract: Model-based reinforcement learning approaches carry the promise of being data efficient. However, due to challenges in learning dynamics models that sufficiently match the real-world dynamics, they struggle to achieve the same asymptotic performance as model-free methods. We propose Model-Based Meta-Policy-Optimization (MB-MPO), an approach that forgoes the strong reliance on accurate learned dynamics models. Using an ensemble of learned dynamics models, MB-MPO meta-learns a policy that can quickly adapt to any model in the ensemble with one policy gradient step. This steers the meta-policy towards internalizing consistent dynamics predictions among the ensemble while shifting the burden of behaving optimally w.r.t. the model discrepancies towards the adaptation step. Our experiments show that MB-MPO is more robust to model imperfections than previous model-based approaches. Finally, we demonstrate that our approach is able to match the asymptotic performance of model-free methods while requiring significantly less experience.

* The first two authors contributed equally. Accepted at the Conference on Robot Learning (CoRL).
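
The adaptation scheme described in the abstract (a meta-policy that adapts to each model in an ensemble with one gradient step) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical ensemble of 1-D linear models, a linear policy with a single parameter `theta`, and closed-form gradients in place of policy gradients estimated from imagined rollouts. All function names, model coefficients, and step sizes are made up for illustration.

```python
# Toy sketch of the MB-MPO idea. Everything here is an illustrative assumption:
# each "learned model" k is a 1-D linear system s' = a_k * s + b_k * u,
# the policy is u = -theta * s, and the return is the one-step reward -(s')^2
# starting from s = 1, so J_k(theta) = -(a_k - b_k * theta)^2.

def model_return(theta, a, b):
    """Return of the policy under one model of the ensemble."""
    return -(a - b * theta) ** 2

def model_return_grad(theta, a, b):
    """Exact gradient dJ_k/dtheta (stands in for a policy gradient estimate)."""
    return 2.0 * b * (a - b * theta)

def adapt(theta, a, b, alpha):
    """Inner step: one gradient-ascent step on the return under a single model."""
    return theta + alpha * model_return_grad(theta, a, b)

def meta_gradient(theta, ensemble, alpha):
    """Gradient of the mean post-adaptation return w.r.t. the meta-parameter."""
    total = 0.0
    for a, b in ensemble:
        theta_k = adapt(theta, a, b, alpha)
        d_adapt = 1.0 - 2.0 * alpha * b * b  # d(theta_k)/d(theta), chain rule
        total += model_return_grad(theta_k, a, b) * d_adapt
    return total / len(ensemble)

def meta_train(ensemble, alpha=0.1, beta=0.05, steps=500, theta=0.0):
    """Outer loop: gradient ascent on the mean post-adaptation return."""
    for _ in range(steps):
        theta += beta * meta_gradient(theta, ensemble, alpha)
    return theta

# Hypothetical ensemble of (a_k, b_k) pairs that disagree slightly.
ensemble = [(0.9, 1.0), (1.0, 1.1), (1.1, 0.9)]
theta_meta = meta_train(ensemble)

# Returns before and after the one-step adaptation to each model.
pre = [model_return(theta_meta, a, b) for a, b in ensemble]
post = [model_return(adapt(theta_meta, a, b, 0.1), a, b) for a, b in ensemble]
```

In this toy setting, a single inner step from `theta_meta` never lowers the return under the model it adapts to. The outer loop optimizes the post-adaptation return rather than the return of `theta_meta` itself, which is the division of labor the abstract describes.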
