Abstract:We address the decision-making capability within an end-to-end planning framework that focuses on motion prediction, decision-making, and trajectory planning. Specifically, we formulate decision-making and trajectory planning as a differentiable nonlinear optimization problem, which ensures compatibility with learning-based modules to establish an end-to-end trainable architecture. This optimization introduces explicit objectives related to safety, traveling efficiency, and riding comfort, guiding the learning process in our proposed pipeline. Intrinsic constraints resulting from the decision-making task are integrated into the optimization formulation and preserved throughout the learning process. By integrating the differentiable optimizer with a neural network predictor, the proposed framework is end-to-end trainable, aligning various driving tasks with ultimate performance goals defined by the optimization objectives. The proposed framework is trained and validated using the Waymo Open Motion dataset. The open-loop testing reveals that while the planning outcomes using our method do not always resemble the expert trajectory, they consistently outperform baseline approaches with improved safety, traveling efficiency, and riding comfort. The closed-loop testing further demonstrates the effectiveness of optimizing decisions and improving driving performance. Ablation studies demonstrate that the initialization provided by the learning-based prediction module is essential for the convergence of the optimizer as well as the overall driving performance.
Abstract:The evaluation and training of autonomous driving systems require diverse and scalable corner cases. However, most existing scene generation methods lack controllability, accuracy, and versatility, resulting in unsatisfactory generation results. To address this problem, we propose Dragtraffic, a generalized, point-based, and controllable traffic scene generation framework based on conditional diffusion. Dragtraffic enables non-experts to generate a variety of realistic driving scenarios for different types of traffic agents through an adaptive mixture expert architecture. We use a regression model to provide a general initial solution and a refinement process based on the conditional diffusion model to ensure diversity. User-customized context is introduced through cross-attention to ensure high controllability. Experiments on a real-world driving dataset show that Dragtraffic outperforms existing methods in terms of authenticity, diversity, and freedom.
Abstract:Accurate trajectory prediction is crucial for safe and efficient autonomous driving, but handling partial observations presents significant challenges. To address this, we propose a novel trajectory prediction framework called Partial Observations Prediction (POP) for congested urban road scenarios. The framework consists of two stages: self-supervised learning (SSL) and feature distillation. In SSL, a reconstruction branch reconstructs the hidden history of partial observations using a mask procedure and reconstruction head. The feature distillation stage transfers knowledge from a fully observed teacher model to a partially observed student model, improving prediction accuracy. POP achieves comparable results to top-performing methods in open-loop experiments and outperforms the baseline method in closed-loop simulations, including safety metrics. Qualitative results illustrate the superiority of POP in providing reasonable and safe trajectory predictions.