Francisco M. Garcia

Learning Reusable Options for Multi-Task Reinforcement Learning

Jan 06, 2020

Francisco M. Garcia, Chris Nota, Philip S. Thomas

Abstract: Reinforcement learning (RL) has become an increasingly active area of research in recent years. Although there are many algorithms that allow an agent to solve tasks efficiently, they often ignore the possibility that prior experience related to the task at hand might be available. For many practical applications, it might be infeasible for an agent to learn how to solve a task from scratch, given that it is generally a computationally expensive process; however, prior experience could be leveraged to make these problems tractable in practice. In this paper, we propose a framework for exploiting existing experience by learning reusable options. We show that after an agent learns policies for solving a small number of problems, we are able to use the trajectories generated from those policies to learn reusable options that allow an agent to quickly learn how to solve novel and related problems.

* 15 pages, 7 figures, pre-print

A Compression-Inspired Framework for Macro Discovery

Feb 22, 2019

Francisco M. Garcia, Bruno C. da Silva, Philip S. Thomas

Abstract: In this paper we consider the problem of how a reinforcement learning agent tasked with solving a set of related Markov decision processes (MDPs) can use knowledge acquired early in its lifetime to more rapidly solve novel, but related, tasks. One way of exploiting this experience is by identifying recurrent patterns in trajectories obtained from well-performing policies. We propose a three-step framework in which an agent 1) generates a set of candidate open-loop macros by compressing trajectories drawn from near-optimal policies; 2) evaluates the value of each macro; and 3) selects a maximally diverse subset of macros that spans the space of policies typically required for solving the set of related tasks. Our experiments show that extending the agent's original primitive action set with the identified macros allows it to more rapidly learn an optimal policy in unseen but similar MDPs.

* Accepted as Extended Abstract, AAMAS, 2019
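
The three steps named in the abstract above can be illustrated with a small sketch. This is not the authors' method: it assumes that n-gram counting stands in for compression, that "symbols saved" stands in for a macro's value, and that rejecting macros contained in an already chosen macro stands in for diversity. All function names and the toy action labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Macro = Tuple[str, ...]  # an open-loop macro: a fixed sequence of primitive actions


def candidate_macros(trajectories: Sequence[Sequence[str]],
                     min_len: int = 2, max_len: int = 4) -> Counter:
    """Step 1 (assumed stand-in for compression): count every contiguous
    action n-gram with min_len <= n <= max_len across all trajectories."""
    counts: Counter = Counter()
    for actions in trajectories:
        for n in range(min_len, max_len + 1):
            for i in range(len(actions) - n + 1):
                counts[tuple(actions[i:i + n])] += 1
    return counts


def score(macro: Macro, count: int) -> int:
    """Step 2 (assumed proxy for value): symbols saved if each occurrence
    of the macro were replaced by a single symbol."""
    return count * (len(macro) - 1)


def contains(inner: Macro, outer: Macro) -> bool:
    """True if `inner` appears as a contiguous run inside `outer`."""
    n = len(inner)
    return any(outer[i:i + n] == inner for i in range(len(outer) - n + 1))


def select_macros(counts: Counter, k: int) -> List[Macro]:
    """Step 3 (assumed diversity rule): greedily take the best-scoring
    macros, skipping any that overlap with one already chosen."""
    ranked = sorted(counts, key=lambda m: (-score(m, counts[m]), m))
    chosen: List[Macro] = []
    for macro in ranked:
        if len(chosen) == k:
            break
        if any(contains(macro, c) or contains(c, macro) for c in chosen):
            continue
        chosen.append(macro)
    return chosen


if __name__ == "__main__":
    # Toy action sequences from hypothetical near-optimal policies.
    demo = [["U", "U", "R", "R"], ["U", "U", "R", "R"], ["L", "L", "D"]]
    print(select_macros(candidate_macros(demo), k=2))
```

The selected macros would then be appended to the agent's primitive action set before learning on a new task; how that is done depends on the learning algorithm and is not shown here.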

A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning

Feb 03, 2019

Francisco M. Garcia, Philip S. Thomas

Abstract: In this paper we consider the problem of how a reinforcement learning agent that is tasked with solving a sequence of reinforcement learning problems (a sequence of Markov decision processes) can use knowledge acquired early in its lifetime to improve its ability to solve new problems. We argue that previous experience with similar problems can provide an agent with information about how it should explore when facing a new but related problem. We show that the search for an optimal exploration strategy can be formulated as a reinforcement learning problem itself and demonstrate that such a strategy can leverage patterns found in the structure of related problems. We conclude with experiments that show the benefits of optimizing an exploration strategy using our proposed approach.

* Accepted as Extended Abstract, AAMAS, 2019
