https://github.com/amitgalor18/STC_Tracker
Transformer networks have been a focus of research in many fields in recent years, being able to surpass the state-of-the-art performance in different computer vision tasks. A few attempts have been made to apply this method to the task of Multiple Object Tracking (MOT), among those the state-of-the-art was TransCenter, a transformer-based MOT architecture with dense object queries for accurately tracking all the objects while keeping reasonable runtime. TransCenter is the first center-based transformer framework for MOT, and is also among the first to show the benefits of using transformer-based architectures for MOT. In this paper we show an improvement to this tracker using post processing mechanism based in the Track-by-Detection paradigm: motion model estimation using Kalman filter and target Re-identification using an embedding network. Our new tracker shows significant improvements in the IDF1 and HOTA metrics and comparable results on the MOTA metric (70.9%, 59.8% and 75.8% respectively) on the MOTChallenge MOT17 test dataset and improvement on all 3 metrics (67.5%, 56.3% and 73.0%) on the MOT20 test dataset. Our tracker is currently ranked first among transformer-based trackers in these datasets. The code is publicly available at: