sites.google.com/view/shasta-3d-mot/home.
Multi-object tracking is a cornerstone capability of any robotic system. Most approaches follow a tracking-by-detection paradigm. However, within this framework, detectors function in a low precision-high recall regime, ensuring a low number of false-negatives while producing a high rate of false-positives. This can negatively affect the tracking component by making data association and track lifecycle management more challenging. Additionally, false-negative detections due to difficult scenarios like occlusions can negatively affect tracking performance. Thus, we propose a method that learns shape and spatio-temporal affinities between consecutive frames to better distinguish between true-positive and false-positive detections and tracks, while compensating for false-negative detections. Our method provides a probabilistic matching of detections that leads to robust data association and track lifecycle management. We quantitatively evaluate our method through ablative experiments and on the nuScenes tracking benchmark where we achieve state-of-the-art results. Our method not only estimates accurate, high-quality tracks but also decreases the overall number of false-positive and false-negative tracks. Please see our project website for source code and demo videos: