Yushen Chen

Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis

Jan 20, 2026
Via arXiv

Semantic-VAE: Semantic-Alignment Latent Representation for Better Speech Synthesis

Sep 26, 2025
Via arXiv

AUV: Teaching Audio Universal Vector Quantization with Single Nested Codebook

Sep 26, 2025
Via arXiv

Accelerating Flow-Matching-Based Text-to-Speech via Empirically Pruned Step Sampling

May 26, 2025
Via arXiv

Towards Flow-Matching-based TTS without Classifier-Free Guidance

Apr 29, 2025
Via arXiv

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Oct 09, 2024
Via arXiv