Abstract:Most prior motion prediction endeavors in autonomous driving have inadequately encoded future scenarios, leading to predictions that may fail to accurately capture the diverse movements of agents (e.g., vehicles or pedestrians). To address this, we propose FutureNet, which explicitly integrates initially predicted trajectories into the future scenario and further encodes these future contexts to enhance subsequent forecasting. Additionally, most previous motion forecasting works have focused on predicting independent futures for each agent. However, safe and smooth autonomous driving requires accurately predicting the diverse future behaviors of numerous surrounding agents jointly in complex dynamic environments. Given that all agents occupy certain potential travel spaces and possess lane driving priority, we propose Lane Occupancy Field (LOF), a new representation with lane semantics for motion forecasting in autonomous driving. LOF can simultaneously capture the joint probability distribution of all road participants' future spatial-temporal positions. Due to the high compatibility between lane occupancy field prediction and trajectory prediction, we propose a novel network with future context encoding for the joint prediction of these two tasks. Our approach ranks 1st on two large-scale motion forecasting benchmarks: Argoverse 1 and Argoverse 2.
Abstract:Predicting the future motion of road participants is crucial for autonomous driving but is extremely challenging due to staggering motion uncertainty. Recently, most motion forecasting methods resort to the goal-based strategy, i.e., predicting endpoints of motion trajectories as conditions to regress the entire trajectories, so that the search space of solution can be reduced. However, accurate goal coordinates are hard to predict and evaluate. In addition, the point representation of the destination limits the utilization of a rich road context, leading to inaccurate prediction results in many cases. Goal area, i.e., the possible destination area, rather than goal coordinate, could provide a more soft constraint for searching potential trajectories by involving more tolerance and guidance. In view of this, we propose a new goal area-based framework, named Goal Area Network (GANet), for motion forecasting, which models goal areas rather than exact goal coordinates as preconditions for trajectory prediction, performing more robustly and accurately. Specifically, we propose a GoICrop (Goal Area of Interest) operator to effectively extract semantic lane features in goal areas and model actors' future interactions, which benefits a lot for future trajectory estimations. GANet ranks the 1st on the leaderboard of Argoverse Challenge among all public literature (till the paper submission), and its source codes will be released.