Zhikang Niu

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training

Dec 20, 2024

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

Oct 09, 2024

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

Jan 30, 2024

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning

Sep 29, 2023