Xinyuan Qian

SAV-SE: Scene-aware Audio-Visual Speech Enhancement with Selective State Space Model

Nov 12, 2024
Via arXiv

Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection

Sep 11, 2024
Via arXiv

Text-Queried Target Sound Event Localization

Jun 23, 2024
Via arXiv

An Exploration of Length Generalization in Transformer-Based Speech Enhancement

Jun 17, 2024
Via arXiv

Mamba in Speech: Towards an Alternative to Self-Attention

May 22, 2024
Via arXiv

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

Apr 29, 2024
Via arXiv

Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training

Apr 01, 2024
Via arXiv

Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions

Oct 23, 2023
Via arXiv

LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism

Oct 17, 2023
Via arXiv

Audio Visual Speaker Localization from Egocentric Views

Sep 28, 2023
Via arXiv