Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Lucas Willems

Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

Jun 24, 2023

Maxime Chevalier-Boisvert, Bolun Dai, Mark Towers, Rodrigo de Lazcano, Lucas Willems, Salem Lahlou, Suman Pal, Pablo Samuel Castro, Jordan Terry

Figure 1 for Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

Figure 2 for Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

Figure 3 for Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

Figure 4 for Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

Abstract:We present the Minigrid and Miniworld libraries which provide a suite of goal-oriented 2D and 3D environments. The libraries were explicitly created with a minimalistic design paradigm to allow users to rapidly develop new environments for a wide range of research-specific needs. As a result, both have received widescale adoption by the RL community, facilitating research in a wide range of areas. In this paper, we outline the design philosophy, environment details, and their world generation API. We also showcase the additional capabilities brought by the unified API between Minigrid and Miniworld through case studies on transfer learning (for both RL agents and humans) between the different observation spaces. The source code of Minigrid and Miniworld can be found at https://github.com/Farama-Foundation/{Minigrid, Miniworld} along with their documentation at https://{minigrid, miniworld}.farama.org/.

Via

Access Paper or Ask Questions

Mastering Rate based Curriculum Learning

Aug 14, 2020

Lucas Willems, Salem Lahlou, Yoshua Bengio

Figure 1 for Mastering Rate based Curriculum Learning

Figure 2 for Mastering Rate based Curriculum Learning

Figure 3 for Mastering Rate based Curriculum Learning

Figure 4 for Mastering Rate based Curriculum Learning

Abstract:Recent automatic curriculum learning algorithms, and in particular Teacher-Student algorithms, rely on the notion of learning progress, making the assumption that the good next tasks are the ones on which the learner is making the fastest progress or digress. In this work, we first propose a simpler and improved version of these algorithms. We then argue that the notion of learning progress itself has several shortcomings that lead to a low sample efficiency for the learner. We finally propose a new algorithm, based on the notion of mastering rate, that significantly outperforms learning progress-based algorithms.

Via

Access Paper or Ask Questions

BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

Oct 27, 2018

Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, Yoshua Bengio

Figure 1 for BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

Figure 2 for BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

Figure 3 for BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

Figure 4 for BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

Abstract:Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties.

* Preprint. Work in progress

Via

Access Paper or Ask Questions