Wangyou Zhang

VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music

Dec 23, 2024

Scale This, Not That: Investigating Key Dataset Attributes for Efficient Speech Enhancement Scaling

Dec 19, 2024

Text-To-Speech Synthesis In The Wild

Sep 13, 2024

Towards Robust Speech Representation Learning for Thousands of Languages

Jul 02, 2024

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

Jun 07, 2024

Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement

Jun 06, 2024

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

Jan 31, 2024

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

Jan 30, 2024

Improving Design of Input Condition Invariant Speech Enhancement

Jan 25, 2024

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction

Oct 12, 2023