Sijie Song

Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos

Mar 23, 2021

MS$^2$L: Multi-Task Self-Supervised Learning for Skeleton Based Action Recognition

Oct 14, 2020

Fashion Meets Computer Vision: A Survey

Mar 31, 2020

Modality Compensation Network: Cross-Modal Adaptation for Action Recognition

Jan 31, 2020

Unsupervised Person Image Generation with Semantic Parsing Transformation

Apr 18, 2019

Temporal Bilinear Networks for Video Action Recognition

Nov 25, 2018

PKU-MMD: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding

Mar 28, 2017

An End-to-End Spatio-Temporal Attention Model for Human Action Recognition from Skeleton Data

Nov 18, 2016