Title: Learning Modal-Invariant and Temporal-Memory for Video-based Visible-Infrared Person Re-Identification

Aug 04, 2022

Xinyu Lin, Jinxing Li, Zeyu Ma, Huafeng Li, Shuang Li, Kaixiong Xu, Guangming Lu, David Zhang

Abstract: Thanks to cross-modal retrieval techniques, visible-infrared (RGB-IR) person re-identification (Re-ID) is achieved by projecting the two modalities into a common space, allowing person Re-ID in 24-hour surveillance systems. However, with respect to probe-to-gallery matching, almost all existing RGB-IR cross-modal person Re-ID methods focus on image-to-image matching, while video-to-video matching, which contains much richer spatial and temporal information, remains under-explored. In this paper, we primarily study video-based cross-modal person Re-ID. To this end, a video-based RGB-IR dataset is constructed, comprising 927 valid identities with 463,259 frames and 21,863 tracklets captured by 12 RGB/IR cameras. Based on this dataset, we show that performance improves as the number of frames in a tracklet increases, demonstrating the significance of video-to-video matching in RGB-IR person Re-ID. Additionally, a novel method is proposed that not only projects the two modalities into a modal-invariant subspace, but also extracts temporal memory for motion invariance. Thanks to these two strategies, much better results are achieved on our video-based cross-modal person Re-ID dataset. The code and dataset are released at: https://github.com/VCMproject233/MITML.

* Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 20973-20982
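
To make the idea of video-to-video matching in a common space concrete, here is a minimal sketch in Python. It is not the method from the paper: it assumes per-frame feature vectors have already been projected into a shared space by some encoder, and it swaps in plain temporal average pooling and cosine similarity as a stand-in for tracklet matching. The function names (`pool_tracklet`, `cosine`, `rank_gallery`) and the toy feature values are hypothetical.

```python
import math


def pool_tracklet(frames):
    """Average per-frame feature vectors into one tracklet-level vector.

    Assumption: temporal average pooling, not the paper's temporal-memory module.
    """
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / n for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(probe_frames, gallery):
    """Return gallery identities sorted by similarity to the probe tracklet."""
    probe = pool_tracklet(probe_frames)
    scores = {
        identity: cosine(probe, pool_tracklet(frames))
        for identity, frames in gallery.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: an infrared probe tracklet and two visible-light gallery tracklets,
# each a list of hypothetical 3-dimensional frame features in the shared space.
ir_probe = [[0.9, 0.1, 0.0], [1.0, 0.2, 0.1]]
rgb_gallery = {
    "id_a": [[1.0, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "id_b": [[0.0, 1.0, 0.3], [0.1, 0.9, 0.2]],
}

print(rank_gallery(ir_probe, rgb_gallery))  # prints ['id_a', 'id_b']
```

Pooling over more frames averages out per-frame noise, which is one simple intuition for why longer tracklets could help; the paper's own approach to temporal information may differ substantially from this sketch.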
