Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Matthias Müller-Brockhausen

EduGym: An Environment Suite for Reinforcement Learning Education

Nov 17, 2023

Thomas M. Moerland, Matthias Müller-Brockhausen, Zhao Yang, Andrius Bernatavicius, Koen Ponse, Tom Kouwenhoven, Andreas Sauter, Michiel van der Meer, Bram Renting, Aske Plaat

Abstract:Due to the empirical success of reinforcement learning, an increasing number of students study the subject. However, from our practical teaching experience, we see students entering the field (bachelor, master and early PhD) often struggle. On the one hand, textbooks and (online) lectures provide the fundamentals, but students find it hard to translate between equations and code. On the other hand, public codebases do provide practical examples, but the implemented algorithms tend to be complex, and the underlying test environments contain multiple reinforcement learning challenges at once. Although this is realistic from a research perspective, it often hinders educational conceptual understanding. To solve this issue we introduce EduGym, a set of educational reinforcement learning environments and associated interactive notebooks tailored for education. Each EduGym environment is specifically designed to illustrate a certain aspect/challenge of reinforcement learning (e.g., exploration, partial observability, stochasticity, etc.), while the associated interactive notebook explains the challenge and its possible solution approaches, connecting equations and code in a single document. An evaluation among RL students and researchers shows 86% of them think EduGym is a useful tool for reinforcement learning education. All notebooks are available from https://sites.google.com/view/edu-gym/home, while the full software package can be installed from https://github.com/RLG-Leiden/edugym.

Via

Access Paper or Ask Questions

Believable Minecraft Settlements by Means of Decentralised Iterative Planning

Sep 19, 2023

Arthur van der Staaij, Jelmer Prins, Vincent L. Prins, Julian Poelsma, Thera Smit, Matthias Müller-Brockhausen, Mike Preuss

Figure 1 for Believable Minecraft Settlements by Means of Decentralised Iterative Planning

Figure 2 for Believable Minecraft Settlements by Means of Decentralised Iterative Planning

Figure 3 for Believable Minecraft Settlements by Means of Decentralised Iterative Planning

Figure 4 for Believable Minecraft Settlements by Means of Decentralised Iterative Planning

Abstract:Procedural city generation that focuses on believability and adaptability to random terrain is a difficult challenge in the field of Procedural Content Generation (PCG). Dozens of researchers compete for a realistic approach in challenges such as the Generative Settlement Design in Minecraft (GDMC), in which our method has won the 2022 competition. This was achieved through a decentralised, iterative planning process that is transferable to similar generation processes that aims to produce "organic" content procedurally.

* 8 pages, 8 figures, to be published in "2023 IEEE Conference on Games (CoG)"

Via

Access Paper or Ask Questions

Reliable validation of Reinforcement Learning Benchmarks

Mar 02, 2022

Matthias Müller-Brockhausen, Aske Plaat, Mike Preuss

Figure 1 for Reliable validation of Reinforcement Learning Benchmarks

Figure 2 for Reliable validation of Reinforcement Learning Benchmarks

Figure 3 for Reliable validation of Reinforcement Learning Benchmarks

Figure 4 for Reliable validation of Reinforcement Learning Benchmarks

Abstract:Reinforcement Learning (RL) is one of the most dynamic research areas in Game AI and AI as a whole, and a wide variety of games are used as its prominent test problems. However, it is subject to the replicability crisis that currently affects most algorithmic AI research. Benchmarking in Reinforcement Learning could be improved through verifiable results. There are numerous benchmark environments whose scores are used to compare different algorithms, such as Atari. Nevertheless, reviewers must trust that figures represent truthful values, as it is difficult to reproduce an exact training curve. We propose improving this situation by providing access to the original experimental data to validate study results. To that end, we rely on the concept of minimal traces. These allow re-simulation of action sequences in deterministic RL environments and, in turn, enable reviewers to verify, re-use, and manually inspect experimental results without needing large compute clusters. It also permits validation of presented reward graphs, an inspection of individual episodes, and re-use of result data (baselines) for proper comparison in follow-up papers. We offer plug-and-play code that works with Gym so that our measures fit well in the existing RL and reproducibility eco-system. Our approach is freely available, easy to use, and adds minimal overhead, as minimal traces allow a data compression ratio of up to $\approx 10^4:1$ (94GB to 8MB for Atari Pong) compared to a regular MDP trace used in offline RL datasets. The paper presents proof-of-concept results for a variety of games.

Via

Access Paper or Ask Questions

Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning

May 31, 2021

Matthias Müller-Brockhausen, Mike Preuss, Aske Plaat

Figure 1 for Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning

Figure 2 for Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning

Figure 3 for Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning

Figure 4 for Procedural Content Generation: Better Benchmarks for Transfer Reinforcement Learning

Abstract:The idea of transfer in reinforcement learning (TRL) is intriguing: being able to transfer knowledge from one problem to another problem without learning everything from scratch. This promises quicker learning and learning more complex methods. To gain an insight into the field and to detect emerging trends, we performed a database search. We note a surprisingly late adoption of deep learning that starts in 2018. The introduction of deep learning has not yet solved the greatest challenge of TRL: generalization. Transfer between different domains works well when domains have strong similarities (e.g. MountainCar to Cartpole), and most TRL publications focus on different tasks within the same domain that have few differences. Most TRL applications we encountered compare their improvements against self-defined baselines, and the field is still missing unified benchmarks. We consider this to be a disappointing situation. For the future, we note that: (1) A clear measure of task similarity is needed. (2) Generalization needs to improve. Promising approaches merge deep learning with planning via MCTS or introduce memory through LSTMs. (3) The lack of benchmarking tools will be remedied to enable meaningful comparison and measure progress. Already Alchemy and Meta-World are emerging as interesting benchmark suites. We note that another development, the increase in procedural content generation (PCG), can improve both benchmarking and generalization in TRL.

Via

Access Paper or Ask Questions