Title: IL-flOw: Imitation Learning from Observation using Normalizing Flows

May 19, 2022

Wei-Di Chang, Juan Camilo Gamboa Higuera, Scott Fujimoto, David Meger, Gregory Dudek

Abstract: We present an algorithm for Inverse Reinforcement Learning (IRL) from expert state observations only. Our approach decouples reward modelling from policy learning, unlike state-of-the-art adversarial methods, which require updating the reward model during policy search and are known to be unstable and difficult to optimize. Our method, IL-flOw, recovers the expert policy by modelling state-state transitions: it generates rewards using deep density estimators trained on the demonstration trajectories, which avoids the instability issues of adversarial methods. We demonstrate that using the state transition log-probability density as a reward signal for forward reinforcement learning translates to matching the trajectory distribution of the expert demonstrations. Experimentally, we show good recovery of the true reward signal as well as state-of-the-art results for imitation from observation on locomotion and robotic continuous control tasks.

* Presented at the 4th Robot Learning Workshop at NeurIPS 2021
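
The reward construction described in the abstract (fit a density model to expert state transitions once, then use its log-probability as a fixed reward) can be illustrated with a minimal sketch. This is not the authors' implementation. As a loud assumption, a full-covariance Gaussian over the concatenated pair (s, s') stands in for the normalizing flow, the synthetic "expert" data is invented for the demo, and the names `fit_gaussian_density` and `transition_log_prob` are hypothetical.

```python
import numpy as np


def fit_gaussian_density(transitions, ridge=1e-6):
    """Fit a full-covariance Gaussian to rows of concatenated (s, s') vectors.

    ASSUMPTION: the Gaussian is a simple stand-in for the deep density
    estimator (normalizing flow) named in the abstract.
    """
    mean = transitions.mean(axis=0)
    centered = transitions - mean
    cov = centered.T @ centered / len(transitions)
    cov += ridge * np.eye(cov.shape[0])  # keep the covariance invertible
    return mean, cov


def transition_log_prob(mean, cov, s, s_next):
    """Log-density of the transition (s, s'), used here as the reward signal."""
    x = np.concatenate([s, s_next], axis=-1) - mean
    _, logdet = np.linalg.slogdet(cov)
    solved = np.linalg.solve(cov, x.T).T  # cov^{-1} x, for one row or a batch
    maha = np.sum(x * solved, axis=-1)    # squared Mahalanobis distance
    dim = mean.shape[0]
    return -0.5 * (maha + logdet + dim * np.log(2.0 * np.pi))


# Synthetic "expert" demonstrations (hypothetical): each state drifts by +0.1.
rng = np.random.default_rng(0)
expert_s = rng.normal(size=(500, 3))
expert_s_next = expert_s + 0.1 + 0.01 * rng.normal(size=(500, 3))

# Stage 1: fit the density model once, on expert transitions only.
mean, cov = fit_gaussian_density(np.hstack([expert_s, expert_s_next]))

# Stage 2: the frozen model scores transitions produced by any policy.
s = np.zeros(3)
reward_expert_like = transition_log_prob(mean, cov, s, s + 0.1)
reward_off_distribution = transition_log_prob(mean, cov, s, s - 1.0)
batch_rewards = transition_log_prob(mean, cov, expert_s, expert_s_next)
```

Because the density model is frozen after the first stage, any standard forward reinforcement learning algorithm could consume these rewards without further updates to the reward model, which is the decoupling the abstract contrasts with adversarial methods.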
