https://github.com/seawee1/driver-dojo.
Reinforcement learning (RL) has shown to reach super human-level performance across a wide range of tasks. However, unlike supervised machine learning, learning strategies that generalize well to a wide range of situations remains one of the most challenging problems for real-world RL. Autonomous driving (AD) provides a multi-faceted experimental field, as it is necessary to learn the correct behavior over many variations of road layouts and large distributions of possible traffic situations, including individual driver personalities and hard-to-predict traffic events. In this paper we propose a challenging benchmark for generalizable RL for AD based on a configurable, flexible, and performant code base. Our benchmark uses a catalog of randomized scenario generators, including multiple mechanisms for road layout and traffic variations, different numerical and visual observation types, distinct action spaces, diverse vehicle models, and allows for use under static scenario definitions. In addition to purely algorithmic insights, our application-oriented benchmark also enables a better understanding of the impact of design decisions such as action and observation space on the generalizability of policies. Our benchmark aims to encourage researchers to propose solutions that are able to successfully generalize across scenarios, a task in which current RL methods fail. The code for the benchmark is available at