Jianwei Yu

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction

Sep 24, 2024
Via arXiv

Comparing Discrete and Continuous Space LLMs for Speech Recognition

Sep 01, 2024
Via arXiv

Gull: A Generative Multifunctional Audio Codec

Apr 07, 2024
Via arXiv

Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings

Jan 29, 2024
Via arXiv

Consistent and Relevant: Rethink the Query Embedding in General Sound Separation

Dec 24, 2023
Via arXiv

SECap: Speech Emotion Captioning with Large Language Model

Dec 23, 2023
Via arXiv

Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction

Oct 15, 2023
Via arXiv

Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition

Sep 27, 2023
Via arXiv

AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data

Sep 25, 2023
Via arXiv

Improved Factorized Neural Transducer Model For text-only Domain Adaptation

Sep 18, 2023
Via arXiv