Abstract:Recent advances in trajectory prediction have shown that explicit reasoning about agents' intent is important for accurately forecasting their motion. However, current research is not directly applicable to intelligent and safety-critical systems. This is mainly because very few public datasets are available, and they only consider pedestrian-specific intents for a short temporal horizon from a restricted egocentric view. To this end, we propose LOKI (LOng term and Key Intentions), a novel large-scale dataset that is designed to tackle joint trajectory and intention prediction for heterogeneous traffic agents (pedestrians and vehicles) in an autonomous driving setting. The LOKI dataset is created to discover several factors that may affect intention, including i) the agent's own will, ii) social interactions, iii) environmental constraints, and iv) contextual information. We also propose a model that jointly performs trajectory and intention prediction, showing that recurrently reasoning about intention can assist with trajectory prediction. We show that our method outperforms state-of-the-art trajectory prediction methods by up to $27\%$ and also provide a baseline for frame-wise intention estimation.
Abstract:Predicting the future motion of actors in a traffic scene is a crucial part of any autonomous driving system. Recent research in this area has focused on trajectory prediction approaches that optimize standard trajectory error metrics. In this work, we describe three important properties -- physical realism guarantees, system maintainability, and sample efficiency -- which we believe are equally important for developing a self-driving system that can operate safely and practically in the real world. Furthermore, we introduce PTNet (PathTrackingNet), a novel approach for vehicle trajectory prediction that is a hybrid of the classical pure pursuit path tracking algorithm and modern graph-based neural networks. By combining a structured robotics technique with a flexible learning approach, we are able to produce a system that not only achieves the same level of performance as other state-of-the-art methods on traditional trajectory error metrics, but also provides strong guarantees about the physical realism of the predicted trajectories while requiring half the amount of data. We believe that focusing on this new class of hybrid approaches is a useful direction for developing and maintaining a safety-critical autonomous driving system.
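For readers unfamiliar with the classical pure pursuit algorithm mentioned above, the following is a minimal sketch of its geometric core only. It is not PTNet's implementation: the function names, the `(x, y, heading)` pose convention, and the kinematic bicycle model used to convert curvature into a steering angle are all assumptions made for illustration.

```python
import math


def pure_pursuit_curvature(pose, target):
    """Curvature of the arc from the vehicle pose to a lookahead point.

    pose   -- (x, y, heading in radians); hypothetical convention
    target -- (x, y) lookahead point on the path to be tracked
    """
    x, y, heading = pose
    dx, dy = target[0] - x, target[1] - y
    lookahead = math.hypot(dx, dy)
    if lookahead == 0.0:
        # Already at the target point: no steering correction is defined.
        return 0.0
    # Angle between the vehicle heading and the line to the target.
    alpha = math.atan2(dy, dx) - heading
    # Classical pure pursuit relation: kappa = 2 * sin(alpha) / L_d.
    return 2.0 * math.sin(alpha) / lookahead


def steering_angle(curvature, wheelbase):
    """Front-wheel steering angle under an assumed kinematic bicycle model."""
    return math.atan(wheelbase * curvature)


if __name__ == "__main__":
    kappa = pure_pursuit_curvature((0.0, 0.0, 0.0), (1.0, 1.0))
    print(kappa, steering_angle(kappa, wheelbase=2.7))
```

Because the tracked arc is fully determined by the pose and the lookahead point, the resulting motion is a circular arc by construction, which is the kind of structural constraint a hybrid approach can rely on for physical realism.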
Abstract:Human trajectory forecasting is an inherently multi-modal problem. Uncertainty in future trajectories stems from two sources: (a) sources that are known to the agent but unknown to the model, such as long-term goals, and (b) sources that are unknown to both the agent & the model, such as the intent of other agents & irreducible randomness in decisions. We propose to factorize this uncertainty into its epistemic & aleatoric sources. We model the epistemic uncertainty through multimodality in long-term goals and the aleatoric uncertainty through multimodality in waypoints & paths. To exemplify this dichotomy, we also propose a novel long-term trajectory forecasting setting, with prediction horizons up to a minute, an order of magnitude longer than prior works. Finally, we present Y-net, a scene-compliant trajectory forecasting network that exploits the proposed epistemic & aleatoric structure for diverse trajectory predictions across long prediction horizons. Y-net significantly improves previous state-of-the-art performance on both (a) the well-studied short prediction horizon settings on the Stanford Drone & ETH/UCY datasets and (b) the proposed long prediction horizon setting on the repurposed Stanford Drone & Intersection Drone datasets.
Abstract:Human trajectory forecasting with multiple socially interacting agents is of critical importance for autonomous navigation in human environments, e.g., for self-driving cars and social robots. In this work, we present Predicted Endpoint Conditioned Network (PECNet) for flexible human trajectory prediction. PECNet infers distant trajectory endpoints to assist in long-range multi-modal trajectory prediction. A novel non-local social pooling layer enables PECNet to infer diverse yet socially compliant trajectories. Additionally, we present a simple "truncation-trick" for improving few-shot multi-modal trajectory prediction performance. We show that PECNet improves state-of-the-art performance on the Stanford Drone trajectory prediction benchmark by ~19.5% and on the ETH/UCY benchmark by ~40.8%.