Abstract:Reinforcement learning has emerged as an important approach for autonomous driving. A reward function is used in reinforcement learning to establish the learned skill objectives and guide the agent toward the optimal policy. Since autonomous driving is a complex domain with partly conflicting objectives with varying degrees of priority, developing a suitable reward function represents a fundamental challenge. This paper aims to highlight the gap in such function design by assessing different proposed formulations in the literature and dividing individual objectives into Safety, Comfort, Progress, and Traffic Rules compliance categories. Additionally, the limitations of the reviewed reward functions are discussed, such as objectives aggregation and indifference to driving context. Furthermore, the reward categories are frequently inadequately formulated and lack standardization. This paper concludes by proposing future research that potentially addresses the observed shortcomings in rewards, including a reward validation framework and structured rewards that are context-aware and able to resolve conflicts.
Abstract:Reinforcement Learning is a highly active research field with promising advancements. In the field of autonomous driving, however, often very simple scenarios are being examined. Common approaches use non-interpretable control commands as the action space and unstructured reward designs which lack structure. In this work, we introduce Informed Reinforcement Learning, where a structured rulebook is integrated as a knowledge source. We learn trajectories and asses them with a situation-aware reward design, leading to a dynamic reward which allows the agent to learn situations which require controlled traffic rule exceptions. Our method is applicable to arbitrary RL models. We successfully demonstrate high completion rates of complex scenarios with recent model-based agents.
Abstract:Accurately forecasting the motion of traffic actors is crucial for the deployment of autonomous vehicles at a large scale. Current trajectory forecasting approaches primarily concentrate on optimizing a loss function with a specific metric, which can result in predictions that do not adhere to physical laws or violate external constraints. Our objective is to incorporate explicit knowledge priors that allow a network to forecast future trajectories in compliance with both the kinematic constraints of a vehicle and the geometry of the driving environment. To achieve this, we introduce a non-parametric pruning layer and attention layers to integrate the defined knowledge priors. Our proposed method is designed to ensure reachability guarantees for traffic actors in both complex and dynamic situations. By conditioning the network to follow physical laws, we can obtain accurate and safe predictions, essential for maintaining autonomous vehicles' safety and efficiency in real-world settings.In summary, this paper presents concepts that prevent off-road predictions for safe and reliable motion forecasting by incorporating knowledge priors into the training process.
Abstract:Effective and flexible allocation of visual attention is key for pedestrians who have to navigate to a desired goal under different conditions of urgency and safety preferences. While automatic modelling of pedestrian attention holds great promise to improve simulations of pedestrian behavior, current saliency prediction approaches mostly focus on generic free-viewing scenarios and do not reflect the specific challenges present in pedestrian attention prediction. In this paper, we present Context-SalNET, a novel encoder-decoder architecture that explicitly addresses three key challenges of visual attention prediction in pedestrians: First, Context-SalNET explicitly models the context factors urgency and safety preference in the latent space of the encoder-decoder model. Second, we propose the exponentially weighted mean squared error loss (ew-MSE) that is able to better cope with the fact that only a small part of the ground truth saliency maps consist of non-zero entries. Third, we explicitly model epistemic uncertainty to account for the fact that training data for pedestrian attention prediction is limited. To evaluate Context-SalNET, we recorded the first dataset of pedestrian visual attention in VR that includes explicit variation of the context factors urgency and safety preference. Context-SalNET achieves clear improvements over state-of-the-art saliency prediction approaches as well as over ablations. Our novel dataset will be made fully available and can serve as a valuable resource for further research on pedestrian attention prediction.