Title: Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning

May 09, 2023

Patrick Emedom-Nnamdi, Abram L. Friesen, Bobak Shahriari, Nando de Freitas, Matt W. Hoffman

Abstract: Standard approaches to sequential decision-making exploit an agent's ability to continually interact with its environment and improve its control policy. However, due to safety, ethical, and practicality constraints, this type of trial-and-error experimentation is often infeasible in real-world domains such as healthcare and robotics. Instead, control policies in these domains are typically trained offline from previously logged data or in a growing-batch manner. In the growing-batch setting, a fixed policy is deployed to the environment and used to gather an entire batch of new data, which is then aggregated with past batches and used to update the policy. This improvement cycle can then be repeated multiple times. While a limited number of such cycles is feasible in real-world domains, the quality and diversity of the resulting data are much lower than in the standard continually-interacting approach. However, data collection in these domains is often performed in conjunction with human experts, who are able to label or annotate the collected data. In this paper, we first explore the trade-offs present in this growing-batch setting, and then investigate how information provided by a teacher (i.e., demonstrations, expert actions, and gradient information) can be leveraged at training time to mitigate the sample complexity and coverage requirements for actor-critic methods. We validate our contributions on tasks from the DeepMind Control Suite.
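
The growing-batch cycle described in the abstract (deploy a fixed policy, gather a batch, aggregate it with past batches, update the policy, repeat) can be illustrated with a minimal toy sketch. This is not the paper's actor-critic method and uses no teacher signal: the one-dimensional environment, the reward function, the helper names (`collect_batch`, `update_policy`, `growing_batch_training`), and the "average the best logged actions" update rule are all assumptions chosen only to make the data flow concrete.

```python
import random


def collect_batch(policy_mean, batch_size, rng, noise=0.5):
    """Deploy a fixed policy and log (action, reward) pairs.

    Hypothetical toy environment: reward is highest at action = 2.0.
    """
    batch = []
    for _ in range(batch_size):
        action = policy_mean + rng.gauss(0.0, noise)  # exploration noise
        reward = -(action - 2.0) ** 2
        batch.append((action, reward))
    return batch


def update_policy(dataset, top_fraction=0.2):
    """Offline update (assumed rule): average the best-rewarded logged actions."""
    ranked = sorted(dataset, key=lambda pair: pair[1], reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    return sum(action for action, _ in ranked[:keep]) / keep


def growing_batch_training(num_cycles=5, batch_size=200, seed=0):
    """Run a few deploy / aggregate / update cycles and track the policy."""
    rng = random.Random(seed)
    policy_mean = 0.0   # initial policy parameter
    dataset = []        # grows by one batch per cycle
    history = [policy_mean]
    for _ in range(num_cycles):
        # The policy stays fixed while the whole batch is gathered.
        batch = collect_batch(policy_mean, batch_size, rng)
        # New data is aggregated with all past batches.
        dataset.extend(batch)
        # The policy is updated offline from the aggregated data.
        policy_mean = update_policy(dataset)
        history.append(policy_mean)
    return history, len(dataset)


if __name__ == "__main__":
    history, num_transitions = growing_batch_training()
    print("policy parameter per cycle:", history)
    print("transitions collected:", num_transitions)
```

In this sketch the policy can only change between cycles, so each batch reflects a single fixed behaviour. That is a simple way to see why data diversity is more limited here than in the continually-interacting case.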

* Reincarnating Reinforcement Learning Workshop at ICLR 2023

View paper on

OpenReview
