Shengeng Tang

Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning

Feb 11, 2025
Via arXiv

Efficient Vision Language Model Fine-tuning for Text-based Person Anomaly Search

Feb 05, 2025
Via arXiv

Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition

Dec 22, 2024
Via arXiv

Linguistics-Vision Monotonic Consistent Network for Sign Language Production

Dec 22, 2024
Via arXiv

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production

Dec 19, 2024
Via arXiv

Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration

Dec 17, 2024
Via arXiv

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Dec 14, 2024
Via arXiv

Towards Pixel-Level Prediction for Gaze Following: Benchmark and Approach

Nov 30, 2024
Via arXiv

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observation

Nov 25, 2024
Via arXiv

Modality Alignment Meets Federated Broadcasting

Nov 24, 2024
Via arXiv