Title: Spatiotemporal Transformer for Video-based Person Re-identification

Mar 30, 2021

Tianyu Zhang, Longhui Wei, Lingxi Xie, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

Abstract: Recently, the Transformer module has been transplanted from natural language processing to computer vision. This paper applies the Transformer to video-based person re-identification, where the key issue is to extract discriminative information from a tracklet. We show that, despite its strong learning ability, the vanilla Transformer suffers from an increased risk of over-fitting, arguably due to a large number of attention parameters and insufficient training data. To solve this problem, we propose a novel pipeline in which the model is pre-trained on a set of synthesized video data and then transferred to the downstream domains with the perception-constrained Spatiotemporal Transformer (STT) module and the Global Transformer (GT) module. The derived algorithm achieves significant accuracy gains on three popular video-based person re-identification benchmarks, MARS, DukeMTMC-VideoReID, and LS-VID, especially when the training and testing data are from different domains. More importantly, our research sheds light on the application of the Transformer to highly structured visual data.

* 10 pages, 7 figures
