Abstract:Trajectory prediction plays a crucial role in improving the safety and reliability of autonomous vehicles, serving as an intermediate link between perception and planning. However, due to the highly dynamic and multimodal nature of the task, accurately predicting the future trajectory of a target vehicle remains a significant challenge. To address these challenges, we propose an Ego vehicle Planning-informed Network (EPN) for multimodal trajectory prediction. Current trajectory prediction methods typically use the historical trajectory and vehicle attributes as inputs, focusing primarily on how historical information influences the future trajectory of the target vehicle. In real-world driving scenarios, however, the future trajectory of a vehicle is influenced not only by its own historical data but also by the behavior of other vehicles on the road. To address this, we incorporate the future planned trajectory of the ego vehicle as an additional input to simulate the mutual influence between the ego vehicle's planned trajectory and the predicted trajectory of the target vehicle. Furthermore, to tackle the challenges of intention ambiguity and large prediction errors often encountered in methods based on driving intentions, we propose a target's endpoint prediction module. This module first predicts the possible endpoints of the target vehicle, then refines these predictions through a correction mechanism, and finally generates a complete multimodal predicted trajectory based on the corrected endpoints. Experimental results demonstrate that, compared to other trajectory prediction methods, EPN achieves an average reduction of 34.9%, 30.7%, and 30.4% in RMSE, ADE, and FDE evaluation metrics on the NGSIM dataset, and an average reduction of 64.6%, 64.5%, and 64.3% in RMSE, ADE, and FDE on the HighD dataset. These results highlight the strong performance of EPN in trajectory prediction.
Abstract:A stable 3D object detection model based on BEV paradigm with temporal information is very important for autonomous driving systems. However, current temporal fusion model use convolutional layer or deformable self-attention is not conducive to the exchange of global information of BEV space and has more computational cost. Recently, a newly proposed based model specialized in processing sequence called mamba has shown great potential in multiple downstream task. In this work, we proposed a mamba2-based BEV 3D object detection model named MambaBEV. We also adapt an end to end self driving paradigm to test the performance of the model. Our work performs pretty good results on nucences datasets:Our base version achieves 51.7% NDS. Our code will be available soon.