Title: Temporal Saliency Query Network for Efficient Video Recognition

Jul 21, 2022

Boyang Xia, Zhihao Wang, Wenhao Wu, Haoran Wang, Jungong Han

Abstract: Efficient video recognition is an active research topic, driven by the explosive growth of multimedia data on the Internet and mobile devices. Most existing methods select salient frames without awareness of class-specific saliency scores, and so neglect the implicit association between the saliency of a frame and the category it belongs to. To alleviate this issue, we devise a novel Temporal Saliency Query (TSQ) mechanism, which introduces class-specific information to provide fine-grained cues for saliency measurement. Specifically, we model the class-specific saliency measuring process as a query-response task. For each category, its common pattern is employed as a query, and the most salient frames respond to it. The calculated similarities are then adopted as the frame saliency scores. To achieve this, we propose a Temporal Saliency Query Network (TSQNet) that includes two instantiations of the TSQ mechanism, based on visual appearance similarities and textual event-object relations. Cross-modality interactions are then imposed to promote information exchange between them. Finally, we use the class-specific saliencies of the most confident categories generated by the two modalities to select the salient frames. Extensive experiments demonstrate the effectiveness of our method, which achieves state-of-the-art results on the ActivityNet, FCVID and Mini-Kinetics datasets. Our project page is at https://lawrencexia2008.github.io/projects/tsqnet .
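
The query-response idea in the abstract can be illustrated with a toy sketch. This is not the authors' TSQNet: it assumes that class queries and frame features are already given as plain vectors, uses cosine similarity as the similarity measure, and takes a class's mean saliency over frames as a hypothetical stand-in for its confidence. The learned queries, the visual and textual branches, and the cross-modality interaction are all omitted, and every function and parameter name below is illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def saliency_scores(class_queries, frame_features):
    """scores[c][t]: similarity between the query of class c and frame t."""
    return [[cosine(q, f) for f in frame_features] for q in class_queries]


def select_frames(class_queries, frame_features, top_classes=1, num_frames=2):
    """Pick the frames most salient for the most confident classes."""
    scores = saliency_scores(class_queries, frame_features)
    # Assumption: class confidence is approximated by mean saliency over frames.
    confidence = [sum(row) / len(row) for row in scores]
    best = sorted(range(len(scores)), key=lambda c: confidence[c], reverse=True)
    best = best[:top_classes]
    # Frame saliency: average of the class-specific scores of the chosen classes.
    frame_saliency = [
        sum(scores[c][t] for c in best) / len(best)
        for t in range(len(frame_features))
    ]
    picked = sorted(
        range(len(frame_features)), key=lambda t: frame_saliency[t], reverse=True
    )[:num_frames]
    return sorted(picked)  # return indices in temporal order


# Toy data: two class queries and four frame features in a 2-D feature space.
queries = [[1.0, 0.0], [0.0, 1.0]]
frames = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.9, 0.1]]
print(select_frames(queries, frames, top_classes=1, num_frames=2))  # → [0, 3]
```

With `top_classes=1`, only the class whose query best matches the video overall contributes, so the selected frames are those closest to that single query. Raising `top_classes` averages the class-specific saliencies of several candidate categories before ranking frames.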

* Accepted by ECCV 2022
