Title: Fully General Online Imitation Learning

Feb 17, 2021

Michael K. Cohen, Marcus Hutter, Neel Nanda

Abstract: In imitation learning, imitators and demonstrators are policies for picking actions given past interactions with the environment. If we run an imitator, we probably want events to unfold similarly to the way they would have if the demonstrator had been acting the whole time. No existing work provides formal guidance in how this might be accomplished, instead restricting focus to environments that restart, making learning unusually easy, and conveniently limiting the significance of any mistake. We address a fully general setting, in which the (stochastic) environment and demonstrator never reset, not even for training purposes. Our new conservative Bayesian imitation learner underestimates the probabilities of each available action, and queries for more data with the remaining probability. Our main result: if an event would have been unlikely had the demonstrator acted the whole time, that event's likelihood can be bounded above when running the (initially totally ignorant) imitator instead. Meanwhile, queries to the demonstrator rapidly diminish in frequency.
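The "underestimate, then query with the remainder" idea from the abstract can be illustrated with a small sketch. This is not the paper's algorithm as specified: it assumes a finite list of candidate demonstrator models, and it assumes the underestimate is formed by taking, for each action, the minimum probability over models whose posterior weight is within a factor `alpha` of the top weight. The names `conservative_action_probs`, `bayes_update`, `step`, `alpha`, `model_a`, and `model_b` are all hypothetical.

```python
import random

def conservative_action_probs(models, posterior, history, actions, alpha=0.5):
    """Underestimate each action's probability; the leftover mass is the query probability.

    Assumption: a model is "plausible" if its posterior weight is at least
    alpha times the largest posterior weight.
    """
    top = max(posterior)
    plausible = [m for m, w in zip(models, posterior) if w >= alpha * top]
    # Minimum over plausible models gives a lower bound on each action's probability.
    probs = {a: min(m(history)[a] for m in plausible) for a in actions}
    query_prob = max(1.0 - sum(probs.values()), 0.0)
    return probs, query_prob

def bayes_update(models, posterior, history, observed_action):
    """Standard Bayesian update of model weights after seeing a demonstrator action."""
    unnorm = [w * m(history)[observed_action] for m, w in zip(models, posterior)]
    total = sum(unnorm)
    if total == 0.0:
        return list(posterior)  # no model explains the observation; keep weights
    return [u / total for u in unnorm]

def step(models, posterior, history, actions, demonstrator, rng, alpha=0.5):
    """Act with the underestimated probabilities, or query the demonstrator otherwise."""
    probs, _ = conservative_action_probs(models, posterior, history, actions, alpha)
    r = rng.random()
    cumulative = 0.0
    for a in actions:
        cumulative += probs[a]
        if r < cumulative:
            return a, posterior, False  # imitator acted on its own
    # Remaining probability mass: ask the demonstrator and learn from the answer.
    a = demonstrator(history)
    return a, bayes_update(models, posterior, history, a), True

# Two hypothetical candidate models of the demonstrator (history is ignored here).
def model_a(history):
    return {"left": 0.9, "right": 0.1}

def model_b(history):
    return {"left": 0.5, "right": 0.5}
```

With equal weights on `model_a` and `model_b`, the sketch assigns 0.5 to "left", 0.1 to "right", and queries with the remaining 0.4. As the posterior concentrates on one model, the plausible set shrinks and the query probability falls, which mirrors the abstract's claim that queries diminish in frequency, though the sketch carries none of the paper's formal guarantees.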

* 13 pages with 8-page appendix
