Title: A Decoupled Spatio-Temporal Framework for Skeleton-based Action Segmentation

Dec 10, 2023

Yunheng Li, Zhongyu Li, Shanghua Gao, Qilong Wang, Qibin Hou, Ming-Ming Cheng

Abstract: Effectively modeling discriminative spatio-temporal information is essential for segmenting activities in long action sequences. However, we observe that existing methods have weak spatio-temporal modeling capability because of two forms of coupled modeling: (i) cascaded interaction couples spatial and temporal modeling, which over-smooths motion modeling over long sequences, and (ii) joint-shared temporal modeling applies the same weights to every joint, ignoring the distinct motion patterns of different joints. We propose a Decoupled Spatio-Temporal Framework (DeST) to address these issues. First, we decouple the cascaded spatio-temporal interaction, which avoids stacking multiple spatio-temporal blocks while still achieving sufficient spatio-temporal interaction. Specifically, DeST performs unified spatial modeling once and divides the resulting spatial features into groups of sub-features, which then adaptively interact with temporal features from different layers. Because the sub-features carry distinct spatial semantics, the model can learn the optimal interaction pattern at each layer. Second, inspired by the fact that different joints move at different speeds, we propose joint-decoupled temporal modeling, which employs independent trainable weights to capture the distinctive temporal features of each joint. On four large-scale benchmarks of different scenes, DeST significantly outperforms current state-of-the-art methods with less computational complexity.
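The contrast between joint-shared and joint-decoupled temporal modeling can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`conv1d_same`, `shared_temporal_conv`, `joint_decoupled_temporal_conv`), the use of a plain 1D temporal filter, the zero padding, and the odd kernel length are all assumptions made for illustration. The only idea taken from the abstract is that each joint gets its own weights instead of one set shared by all joints.

```python
def conv1d_same(seq, kernel):
    """Slide `kernel` over `seq` along time, zero-padded so the output keeps length T.

    Assumes an odd kernel length (illustrative choice, not from the paper).
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(seq))
    ]


def shared_temporal_conv(x, kernel):
    """Joint-shared modeling: every joint's sequence is filtered by the same kernel.

    x: list of J joints, each a list of T values.
    """
    return [conv1d_same(seq, kernel) for seq in x]


def joint_decoupled_temporal_conv(x, kernels):
    """Joint-decoupled modeling: joint j is filtered by its own kernel kernels[j]."""
    if len(x) != len(kernels):
        raise ValueError("need exactly one kernel per joint")
    return [conv1d_same(seq, k) for seq, k in zip(x, kernels)]


# Two hypothetical joints with the same trajectory but different per-joint filters.
x = [[1, 2, 3, 4], [1, 2, 3, 4]]
kernels = [
    [0, 1, 0],   # joint 0: identity filter, passes the signal through
    [1, 0, -1],  # joint 1: difference filter, responds to change over time
]
decoupled = joint_decoupled_temporal_conv(x, kernels)
shared = shared_temporal_conv(x, [0, 1, 0])
```

In a trainable model the per-joint kernels would be learned parameters; here they are fixed by hand only to show that two joints with identical input can produce different temporal features once their weights are independent.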
