Abstract: In automated driving, predicting the trajectories of surrounding vehicles supports reasoning about scene dynamics and enables safe planning for the ego vehicle. However, existing models treat prediction as an instantaneous task of forecasting future trajectories from observed information. As time proceeds, each new prediction is made independently of the previous one, so the model cannot correct its errors during inference and will repeat them. To alleviate this problem and better leverage temporal data, we propose a novel retrospection technique. Through training on closed-loop rollouts, the model learns to use aggregated feedback. Given new observations, it reflects on previous predictions and analyzes its errors to improve the quality of subsequent predictions. Thus, the model can learn to correct systematic errors during inference. Comprehensive experiments on nuScenes and Argoverse demonstrate a considerable decrease in minimum Average Displacement Error of up to 31.9% compared to the state-of-the-art baseline without retrospection. We further showcase the robustness of our technique by demonstrating better handling of out-of-distribution scenarios with undetected road users.
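For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how minimum Average Displacement Error is commonly computed for multi-modal trajectory predictions. The function names, the toy trajectories, and the `relative_decrease` helper are illustrative assumptions and are not taken from the paper; the paper's exact evaluation protocol (number of modes, horizon, sampling rate) is not stated in the abstract.

```python
import math


def min_ade(predicted_modes, ground_truth):
    """Minimum Average Displacement Error over K predicted trajectories.

    predicted_modes: list of K trajectories, each a list of (x, y) points.
    ground_truth: a single trajectory with the same number of points.
    """
    def ade(trajectory):
        # Mean Euclidean distance between matching time steps.
        distances = [math.dist(p, g) for p, g in zip(trajectory, ground_truth)]
        return sum(distances) / len(distances)

    # Only the best of the K modes counts.
    return min(ade(mode) for mode in predicted_modes)


def relative_decrease(baseline, improved):
    """Relative reduction in percent (hypothetical helper for comparing two scores)."""
    return 100.0 * (baseline - improved) / baseline


# Toy data, invented for illustration only.
ground_truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
modes = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # constant 1 m lateral offset
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.5)],  # exact until the last step
]
print(min_ade(modes, ground_truth))
```

A reported decrease "of up to 31.9%" is a relative reduction of this kind of score against the baseline, which is what `relative_decrease` expresses.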
Abstract: Monocular 3D lane detection has become a fundamental problem in the context of autonomous driving; it comprises the tasks of finding the road surface and locating lane markings. One major challenge lies in finding a flexible but robust line representation capable of modeling complex lane structures while still avoiding unpredictable behavior. Whereas previous methods rely on fully data-driven approaches, we introduce LaneCPP, a novel approach that uses a continuous 3D lane detection model leveraging physical prior knowledge about the lane structure and road geometry. Although our sophisticated lane model is capable of modeling complex road structures, it also behaves robustly, since physical constraints are incorporated by means of a regularization scheme that can be applied analytically to our parametric representation. Moreover, we incorporate prior knowledge about the road geometry into the 3D feature space by modeling geometry-aware spatial features, guiding the network to learn an internal road surface representation. In our experiments, we show the benefits of our contributions and demonstrate the value of using priors to make 3D lane detection more robust. The results show that LaneCPP achieves state-of-the-art performance in terms of F-Score and geometric errors.
Abstract: Planning the trajectory of the controlled ego vehicle is a key challenge in automated driving. As with human drivers, predicting the motion of surrounding vehicles is important for planning one's own actions. Recent motion prediction methods utilize equivariant neural networks to exploit geometric symmetries in the scene. However, no existing method combines motion prediction and trajectory planning in a joint step while guaranteeing equivariance under roto-translations of the input space. We address this gap by proposing a lightweight equivariant planning model that generates multi-modal joint predictions for all vehicles and selects one mode as the ego plan. The equivariant network design improves sample efficiency, guarantees output stability, and reduces the number of model parameters. We further propose equivariant route attraction to guide the ego vehicle along a high-level route provided by an off-the-shelf GPS navigation system. This module creates a momentum from embedded vehicle positions toward the route in latent space while preserving the equivariance property. Route attraction enables goal-oriented behavior without forcing the vehicle to stick to the exact route. We conduct experiments on the challenging nuScenes dataset to investigate the capability of our planner. The results show that the planned trajectory is stable under roto-translations of the input scene, which demonstrates the equivariance of our model. Despite using only a small split of the dataset for training, our method improves L2 distance at 3 s by 20.6% and surpasses the state of the art.
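The equivariance property claimed above can be stated as f(T(x)) = T(f(x)) for any roto-translation T. The sketch below checks that identity numerically. The constant-velocity predictor is a hypothetical stand-in chosen because it is trivially equivariant; it is not the paper's planner, and all names and values here are assumptions made for illustration.

```python
import math


def roto_translate(points, angle, shift):
    """Rotate 2D points by `angle` (radians) about the origin, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]


def constant_velocity_predictor(history, steps=3):
    """Toy stand-in for a learned model: extrapolate the last observed displacement."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, steps + 1)]


def equivariance_gap(predictor, history, angle, shift):
    """Largest point-wise distance between T(f(x)) and f(T(x))."""
    transformed_output = roto_translate(predictor(history), angle, shift)
    output_of_transformed = predictor(roto_translate(history, angle, shift))
    return max(
        math.dist(a, b) for a, b in zip(transformed_output, output_of_transformed)
    )


# Toy observed positions, invented for illustration only.
history = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
gap = equivariance_gap(
    constant_velocity_predictor, history, angle=math.pi / 3, shift=(5.0, -2.0)
)
print(gap < 1e-9)
```

For an equivariant model the gap is zero up to floating-point error. A model that is not equivariant by design would generally show a non-zero gap that varies with the chosen angle and shift.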
Abstract: Deep neural networks tend to make overconfident predictions and often require additional detectors for misclassifications, particularly in safety-critical applications. Existing detection methods usually focus only on adversarial attacks or out-of-distribution samples as causes of false predictions. However, generalization errors occur for diverse reasons, often related to poorly learned invariances. We therefore propose GIT, a holistic approach for the detection of generalization errors that combines gradient information with invariance transformations. The invariance transformations are designed to shift misclassified samples back into the generalization area of the neural network, while the gradient information measures the contradiction between the initial prediction and the corresponding inherent computations of the neural network using the transformed sample. Our experiments demonstrate the superior performance of GIT compared to the state of the art on a variety of network architectures, problem setups, and perturbation types.