Shengeng Tang

Linguistics-Vision Monotonic Consistent Network for Sign Language Production

Dec 22, 2024
Via arXiv

Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition

Dec 22, 2024
Via arXiv

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production

Dec 19, 2024
Via arXiv

Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration

Dec 17, 2024
Via arXiv

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Dec 14, 2024
Via arXiv

Towards Pixel-Level Prediction for Gaze Following: Benchmark and Approach

Nov 30, 2024
Via arXiv

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation

Nov 25, 2024
Via arXiv

Modality Alignment Meets Federated Broadcasting

Nov 24, 2024
Via arXiv

Dataset Distillers Are Good Label Denoisers In the Wild

Nov 18, 2024
Via arXiv

Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing

Oct 16, 2024
Via arXiv