Title: Bridging the Imitation Gap by Adaptive Insubordination

Jul 23, 2020

Luca Weihs, Unnat Jain, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

Abstract: Why do agents often obtain better reinforcement learning policies when imitating a worse expert? We show that privileged information used by the expert is marginalized in the learned agent policy, resulting in an "imitation gap." Prior work bridges this gap via a progression from imitation learning to reinforcement learning. While often successful, gradual progression fails for tasks that require frequent switches between exploration and memorization skills. To better address these tasks and alleviate the imitation gap, we propose 'Adaptive Insubordination' (ADVISOR), which dynamically reweights imitation and reward-based reinforcement learning losses during training, enabling switching between imitation and exploration. On a suite of challenging tasks, we show that ADVISOR outperforms pure imitation, pure reinforcement learning, as well as sequential combinations of these approaches.
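The reweighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the weights are computed, so the function names `adaptive_weight` and `mixed_loss`, the exponential form of the weight, the `disagreement` input, and the `alpha` parameter are all assumptions chosen for illustration. The sketch only shows the general pattern of blending a per-sample imitation loss with a per-sample reinforcement learning loss using a weight that can change from sample to sample.

```python
import math
from typing import Sequence


def adaptive_weight(disagreement: float, alpha: float = 4.0) -> float:
    """Hypothetical weight in (0, 1]: near 1 when imitation looks reliable.

    `disagreement` is an assumed non-negative score of how poorly the
    expert can be imitated in the current state; the exponential form
    and `alpha` are illustrative choices, not taken from the paper.
    """
    if disagreement < 0:
        raise ValueError("disagreement must be non-negative")
    return math.exp(-alpha * disagreement)


def mixed_loss(
    imitation_losses: Sequence[float],
    rl_losses: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Average of w * imitation_loss + (1 - w) * rl_loss over all samples."""
    if not (len(imitation_losses) == len(rl_losses) == len(weights)):
        raise ValueError("all inputs must have the same length")
    if len(weights) == 0:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for il, rl, w in zip(imitation_losses, rl_losses, weights):
        # A weight of 1 trusts imitation fully; a weight of 0 falls back to RL.
        total += w * il + (1.0 - w) * rl
    return total / len(weights)


# Two samples: the first is weighted fully toward imitation,
# the second fully toward the reward-based loss.
example = mixed_loss([2.0, 5.0], [3.0, 7.0], [1.0, 0.0])
```

Because the weight is computed per sample rather than on a fixed schedule, a training loop built this way could lean on imitation in some states and on reward-based learning in others within the same batch, which is the kind of switching the abstract describes.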

* The first two authors contributed equally

Paper available on OpenReview.
