Picture for Junjie Li

Junjie Li

MoMuSE: Momentum Multi-modal Target Speaker Extraction for Real-time Scenarios with Impaired Visual Cues

Add code
Dec 11, 2024
Viaarxiv icon

GiVE: Guiding Visual Encoder to Perceive Overlooked Information

Add code
Oct 26, 2024
Viaarxiv icon

Multi-Level Speaker Representation for Target Speaker Extraction

Add code
Oct 21, 2024
Viaarxiv icon

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction

Add code
Sep 24, 2024
Figure 1 for WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
Figure 2 for WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
Figure 3 for WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
Figure 4 for WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
Viaarxiv icon

On the effectiveness of enrollment speech augmentation for Target Speaker Extraction

Add code
Sep 15, 2024
Figure 1 for On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
Figure 2 for On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
Figure 3 for On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
Figure 4 for On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
Viaarxiv icon

vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders

Add code
Sep 03, 2024
Figure 1 for vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders
Figure 2 for vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders
Figure 3 for vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders
Figure 4 for vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders
Viaarxiv icon

Cross-Modal Spherical Aggregation for Weakly Supervised Remote Sensing Shadow Removal

Add code
Jun 25, 2024
Viaarxiv icon

ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency

Add code
Jun 04, 2024
Viaarxiv icon

A Unified Search and Recommendation Framework Based on Multi-Scenario Learning for Ranking in E-commerce

Add code
May 17, 2024
Viaarxiv icon

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

Add code
Apr 29, 2024
Figure 1 for Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Figure 2 for Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Figure 3 for Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Figure 4 for Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
Viaarxiv icon