Title: STAGE: Spatio-Temporal Attention on Graph Entities for Video Action Detection

Dec 09, 2019

Matteo Tomei, Lorenzo Baraldi, Simone Calderara, Simone Bronzin, Rita Cucchiara

Abstract: Spatio-temporal action localization is a challenging yet fascinating task that aims to detect and classify human actions in video clips. In this paper, we develop a high-level video understanding module that encodes interactions between actors and objects in both space and time. In our formulation, spatio-temporal relationships are learned by performing self-attention operations on a graph structure connecting entities from consecutive clips. Notably, the use of graph learning is unprecedented for this task. From a computational point of view, the proposed module is backbone-independent by design and does not need end-to-end training. When tested on the AVA dataset, it demonstrates a 10-16% relative mAP improvement over the baseline. Furthermore, it outperforms or performs comparably to state-of-the-art models that require heavy end-to-end and synchronized training on multiple GPUs. Code is publicly available at: https://github.com/aimagelab/STAGE_action_detection.
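To make the idea of self-attention over a graph of entities more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see the linked repository for that). The function names `build_adjacency` and `masked_self_attention`, the `max_gap` parameter, and the rule that entities are connected when their clips are at most one step apart are all assumptions made for illustration. The sketch also omits the learned query, key, and value projections that a trained attention layer would normally have.

```python
import math


def build_adjacency(clip_ids, max_gap=1):
    """Connect entities whose clips are at most `max_gap` apart.

    Hypothetical connectivity rule: entities in the same clip (gap 0,
    which includes self-loops) or in neighbouring clips share an edge.
    """
    n = len(clip_ids)
    return [
        [1 if abs(clip_ids[i] - clip_ids[j]) <= max_gap else 0 for j in range(n)]
        for i in range(n)
    ]


def masked_self_attention(features, adjacency):
    """Single-head scaled dot-product self-attention restricted to graph edges.

    features:  list of N feature vectors (lists of floats), one per entity.
    adjacency: N x N list of 0/1 flags; adjacency[i][j] == 1 lets entity i
               attend to entity j. Every row needs at least one 1.
    Returns N updated feature vectors (no learned projections, by assumption).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    updated = []
    for i, query in enumerate(features):
        neighbours = [j for j, flag in enumerate(adjacency[i]) if flag]
        # Attention scores are computed only for connected entities.
        scores = [
            sum(q * k for q, k in zip(query, features[j])) / scale
            for j in neighbours
        ]
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each entity becomes a weighted average of its neighbours' features.
        updated.append([
            sum(w * features[j][d] for w, j in zip(weights, neighbours))
            for d in range(dim)
        ])
    return updated


# Three entities: two in consecutive clips (0 and 1), one in a distant clip (3).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = build_adjacency([0, 1, 3])
updated = masked_self_attention(features, adjacency)
```

In this toy setup the first two entities exchange information because their clips are consecutive, while the third entity only attends to itself and keeps its features unchanged. Because the layer operates on precomputed entity features, a module of this kind can sit on top of any detector or backbone, which is consistent with the backbone-independent design described in the abstract.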
