Abstract:We propose a learning-based Control Barrier Function (CBF) to reduce conservatism in collision avoidance of car-like robots. Traditional CBFs often use the Euclidean distance between robots' centers as the safety margin, neglecting headings and simplifying geometries to circles. While this ensures the smooth, differentiable safety functions required by CBFs, it can be overly conservative in tight environments. To address this limitation, we design a heading-aware safety margin that accounts for the robots' orientations, enabling a less conservative and more accurate estimation of safe regions. Since the function computing this safety margin is non-differentiable, we approximate it with a neural network to ensure differentiability and facilitate integration with CBFs. We describe how we achieve bounded learning error and incorporate the upper bound into the CBF to provide formal safety guarantees through forward invariance. We show that our CBF is a high-order CBF with relative degree two for a system with two robots whose dynamics are modeled by the nonlinear kinematic bicycle model. Experimental results in overtaking and bypassing scenarios reveal a 33.5% reduction in conservatism compared to traditional methods while maintaining safety. Code: https://github.com/bassamlab/sigmarl
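One way to read the role of the error bound is sketched below. This is an illustrative reconstruction under stated assumptions, not the paper's exact formulation: the symbols $d$ (true heading-aware safety margin), $\hat{d}_\theta$ (its neural-network approximation), $\epsilon$ (upper bound on the learning error), and the linear class-$\mathcal{K}$ gains $\alpha_1, \alpha_2 > 0$ are all introduced here for illustration only.

```latex
% Assumption: the learning error is uniformly bounded by \epsilon.
\[
  \lvert d(x) - \hat{d}_\theta(x) \rvert \le \epsilon
  \quad\Longrightarrow\quad
  d(x) \ge \hat{d}_\theta(x) - \epsilon =: h(x).
\]
% Hence h(x) >= 0 implies d(x) >= 0, so keeping the learned,
% differentiable function h nonnegative keeps the true margin nonnegative.
%
% Standard high-order CBF construction for relative degree two
% (the control input u first appears in the second derivative of h):
\[
  \psi_1(x) = \dot{h}(x) + \alpha_1 h(x),
  \qquad
  \psi_2(x, u) = \dot{\psi}_1(x, u) + \alpha_2 \psi_1(x) \ge 0 .
\]
```

Under these assumptions, any input $u$ satisfying $\psi_2(x, u) \ge 0$ from an initial state with $h \ge 0$ and $\psi_1 \ge 0$ renders the set $\{x : h(x) \ge 0\}$ forward invariant, which is the mechanism by which the bound $\epsilon$ can be folded into the safety guarantee.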
Abstract:Non-stationarity poses a fundamental challenge in Multi-Agent Reinforcement Learning (MARL), arising from agents simultaneously learning and altering their policies. This creates a non-stationary environment from the perspective of each individual agent, often leading to suboptimal or even unconverged learning outcomes. We propose an open-source framework named XP-MARL, which augments MARL with auxiliary prioritization to address this challenge in cooperative settings. XP-MARL is 1) founded upon our hypothesis that prioritizing agents and letting higher-priority agents establish their actions first would stabilize the learning process and thus mitigate non-stationarity and 2) enabled by our proposed mechanism called action propagation, where higher-priority agents act first and communicate their actions, providing a more stationary environment for others. Moreover, instead of using a predefined or heuristic priority assignment, XP-MARL learns priority-assignment policies with an auxiliary MARL problem, leading to a joint learning scheme. Experiments in a motion-planning scenario involving Connected and Automated Vehicles (CAVs) demonstrate that XP-MARL improves the safety of a baseline model by 84.4% and outperforms a state-of-the-art approach, which improves the baseline by only 12.8%. Code: github.com/cas-lab-munich/sigmarl
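The action-propagation idea described above can be sketched as follows. This is a minimal toy illustration, not the XP-MARL implementation: the function name `act_with_propagation`, the scalar observations and actions, the fixed priority scores, and the toy policy are all hypothetical. In XP-MARL the priorities come from learned priority-assignment policies rather than fixed numbers.

```python
from typing import Callable, Dict, List

# Hypothetical policy signature: (own observation, actions of higher-priority agents) -> action
Policy = Callable[[float, List[float]], float]


def act_with_propagation(
    observations: Dict[str, float],
    priorities: Dict[str, float],
    policies: Dict[str, Policy],
) -> Dict[str, float]:
    """Let agents act in descending priority order.

    Each agent receives the actions already chosen by all higher-priority
    agents, so its decision is conditioned on fixed, known actions.
    """
    order = sorted(observations, key=lambda agent: priorities[agent], reverse=True)
    propagated: List[float] = []  # actions communicated so far
    actions: Dict[str, float] = {}
    for agent in order:
        actions[agent] = policies[agent](observations[agent], list(propagated))
        propagated.append(actions[agent])
    return actions


def toy_policy(obs: float, higher_priority_actions: List[float]) -> float:
    # Toy rule (assumption): yield by subtracting what higher-priority agents already claimed.
    return obs - sum(higher_priority_actions)


observations = {"a": 1.0, "b": 2.0, "c": 3.0}
priorities = {"a": 0.2, "b": 0.9, "c": 0.5}  # fixed here; learned in XP-MARL
policies = {name: toy_policy for name in observations}

actions = act_with_propagation(observations, priorities, policies)
print(actions)
```

In this toy run the agents act in the order b, c, a, and each later agent sees every earlier action before choosing its own.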
Abstract:Distributed control algorithms are known to reduce overall computation time compared to centralized control algorithms. However, they can result in inconsistent solutions leading to the violation of safety-critical constraints. Inconsistent solutions can arise when two or more agents compute concurrently while making predictions on each other's control actions. To address this issue, we propose an iterative algorithm called Synchronization-Based Cooperative Distributed Model Predictive Control, which we presented in [1]. The algorithm consists of two steps: 1. computing the optimal control inputs for each agent and 2. synchronizing the predicted states across all agents. We demonstrate the efficacy of our algorithm in the control of multiple small-scale vehicles in our Cyber-Physical Mobility Lab.
Abstract:In prioritized planning for vehicles, vehicles plan trajectories in parallel or in sequence. Parallel prioritized planning offers approximately consistent computation time regardless of the number of vehicles but struggles to guarantee collision-free trajectories. Conversely, sequential prioritized planning can guarantee collision-freeness but results in increased computation time as the number of sequentially computing vehicles, which we term computation levels, grows. This number is determined by the directed coupling graph that results from the coupling and prioritization of vehicles. In this work, we guarantee safe trajectories in parallel planning through reachability analysis. Although these trajectories are collision-free, they tend to be conservative. We address this by planning with a subset of vehicles in sequence. We formulate the problem of selecting this subset as a graph partitioning problem that allows us to independently set computation levels. Our simulations demonstrate a reduction in computation levels of approximately 64% compared to sequential prioritized planning while maintaining the solution quality.
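To make the notion of computation levels concrete, the sketch below assigns a level to each vehicle in a directed acyclic coupling graph, so that the number of computation levels equals the number of vertices on the longest path. It is a minimal illustration under assumptions: the function name `computation_levels` and the edge convention (an edge `(i, j)` means vehicle `i` has priority over vehicle `j`) are chosen here and are not taken from the paper, and the sketch does not cover the reachability analysis or the graph partitioning itself.

```python
from collections import defaultdict, deque
from typing import List, Tuple


def computation_levels(num_vehicles: int, edges: List[Tuple[int, int]]) -> List[int]:
    """Return the computation level of each vehicle (1-based).

    A vehicle's level is one more than the highest level among the
    higher-priority vehicles it is coupled with; uncoupled or
    top-priority vehicles are on level 1.
    """
    successors = defaultdict(list)
    in_degree = [0] * num_vehicles
    for i, j in edges:
        successors[i].append(j)
        in_degree[j] += 1

    level = [1] * num_vehicles
    queue = deque(v for v in range(num_vehicles) if in_degree[v] == 0)
    visited = 0
    # Kahn's topological traversal, propagating the longest-path level.
    while queue:
        v = queue.popleft()
        visited += 1
        for w in successors[v]:
            level[w] = max(level[w], level[v] + 1)
            in_degree[w] -= 1
            if in_degree[w] == 0:
                queue.append(w)
    if visited != num_vehicles:
        raise ValueError("coupling graph contains a cycle")
    return level


# Four vehicles: 0 -> 1 -> 2, 0 -> 2, and 3 -> 2.
edges = [(0, 1), (1, 2), (0, 2), (3, 2)]
levels = computation_levels(4, edges)
print(levels, "number of computation levels:", max(levels))
```

Vehicles on the same level can plan in parallel, so removing or redirecting edges that lie on the longest path is what shortens the sequential part of the computation.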
Abstract:Connected and automated vehicles and robot swarms hold transformative potential for enhancing safety, efficiency, and sustainability in the transportation and manufacturing sectors. Extensive testing and validation of these technologies is crucial for their deployment in the real world. While simulations are essential for initial testing, they often have limitations in capturing the complex dynamics of real-world interactions. This limitation underscores the importance of small-scale testbeds. These testbeds provide a realistic, cost-effective, and controlled environment for testing and validating algorithms, acting as an essential intermediary between simulation and full-scale experiments. This work serves to facilitate researchers' efforts in identifying existing small-scale testbeds suitable for their experiments and provide insights for those who want to build their own. In addition, it delivers a comprehensive survey of the current landscape of these testbeds. We derive 62 characteristics of testbeds based on the well-known sense-plan-act paradigm and offer an online table comparing 22 small-scale testbeds based on these characteristics. The online table is hosted on our designated public webpage www.cpm-remote.de/testbeds, and we invite testbed creators and developers to contribute to it. We closely examine nine testbeds in this paper, demonstrating how the derived characteristics can be used to present testbeds. Furthermore, we discuss three ongoing challenges concerning small-scale testbeds that we identified, i.e., small-scale to full-scale transition, sustainability, and power and resource management.
Abstract:This paper introduces an open-source, decentralized framework named SigmaRL, designed to enhance both sample efficiency and generalization of multi-agent Reinforcement Learning (RL) for motion planning of connected and automated vehicles. Most RL agents exhibit a limited capacity to generalize, often focusing narrowly on specific scenarios, and are usually evaluated in similar or even the same scenarios seen during training. Various methods have been proposed to address these challenges, including experience replay and regularization. However, how observation design in RL affects sample efficiency and generalization remains an under-explored area. We address this gap by proposing five strategies to design information-dense observations, focusing on general features that are applicable to most traffic scenarios. We train our RL agents using these strategies on an intersection and evaluate their generalization through numerical experiments across completely unseen traffic scenarios, including a new intersection, an on-ramp, and a roundabout. Incorporating these information-dense observations reduces training times to under one hour on a single CPU, and the evaluation results reveal that our RL agents can effectively zero-shot generalize. Code: github.com/cas-lab-munich/SigmaRL
Abstract:Trajectory planning for autonomous cars can be addressed by primitive-based methods, which encode nonlinear dynamical system behavior into automata. In this paper, we focus on optimal trajectory planning. Since, typically, multiple criteria have to be taken into account, multiobjective optimization problems have to be solved. For the resulting Pareto-optimal motion primitives, we introduce a universal automaton, which can be reduced or reconfigured according to prioritized criteria during planning. We evaluate a corresponding multi-vehicle planning scenario with both simulations and laboratory experiments.
Abstract:This paper presents a localization system that uses infrared beacons and a camera equipped with an optical band-pass filter. Our system can reliably detect and identify individual beacons at a distance of 100 m regardless of lighting conditions. We describe the camera and beacon design as well as the image processing pipeline in detail. In our experiments, we investigate and demonstrate the ability of the system to recognize our beacons in both daytime and nighttime conditions. High-precision localization is a key enabler for automated vehicles but remains unsolved, despite strong recent improvements. Our low-cost, infrastructure-based approach helps solve the localization problem. All datasets are made available.
Abstract:We introduce our Cyber-Physical Mobility Lab (CPM Lab), a development environment for networked and autonomous vehicles. It consists of 20 model-scale vehicles for experiments and a simulation environment. We show our four-layered architecture that enables the seamless use of the same software in simulations and in experiments without any adaptations. A Data Distribution Service (DDS)-based middleware allows the number of vehicles to be adapted seamlessly during experiments. Experiments with the 20 vehicles can be extended by an unlimited number of additional simulated vehicles. Another layer is responsible for synchronizing all entities following a logical execution time approach. We pursue an open policy in the CPM Lab and will publish the entire code as well as construction plans online. Additionally, we will offer remote access to the CPM Lab via a web interface. The remote access will be publicly available. The CPM Lab allows researchers as well as students from different disciplines to see their ideas develop into reality.
Abstract:This paper presents the $\mathrm{\mu}$Car, a 1:18 model-scale vehicle with Ackermann steering geometry developed for experiments in networked and autonomous driving in research and education. The vehicle is open source, moderately priced, and highly flexible, which allows for many applications. It is equipped with an inertial measurement unit and an odometer and obtains its pose via WLAN from an indoor positioning system. The two supported operating modes for controlling the vehicle are (1) computing control inputs on external hardware, transmitting them via WLAN, and applying the received inputs to the actuators and (2) transmitting a reference trajectory via WLAN, which is then followed by a controller running on the onboard Raspberry Pi Zero W. The design allows identical vehicles to be used at the same time in order to conduct experiments with a large number of networked agents.