Abstract: We present DriveGPT, a scalable behavior model for autonomous driving. We model driving as a sequential decision-making task and train a transformer model to predict future agent states as tokens in an autoregressive fashion. We scale up our model parameters and training data by multiple orders of magnitude, enabling us to explore the scaling properties in terms of dataset size, model parameters, and compute. We evaluate DriveGPT across different scales in a planning task, through both quantitative metrics and qualitative examples, including closed-loop driving in complex real-world scenarios. In a separate prediction task, DriveGPT outperforms a state-of-the-art baseline and improves further when pretrained on a large-scale dataset, further validating the benefits of data scaling.
Abstract: Robust data association is critical for analysis of long-term motion trajectories in complex scenes. In its absence, periods of kinematic ambiguity reduce trajectory precision, degrading the quality of follow-on analysis. Common optimization-based approaches often neglect to quantify the uncertainty arising from these events. To address this, we propose the Joint Posterior Tracker (JPT), a Bayesian multi-object tracking algorithm that robustly reasons over the posterior of associations and trajectories. We design novel permutation-based proposals to explore posterior modes that correspond to plausible association hypotheses. Compared to existing baselines, JPT represents the uncertainty of data associations more accurately and achieves superior performance on standard metrics. We also show the utility of JPT applied to automatic scheduling of user-in-the-loop annotations for improved trajectory quality.