Yiwei Guo

Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding

Oct 29, 2024
Via arXiv

LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec

Oct 21, 2024
Via arXiv

TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration

Oct 16, 2024
Via arXiv

vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders

Sep 03, 2024
Via arXiv

DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation

Jul 18, 2024
Via arXiv

On the Effectiveness of Acoustic BPE in Decoder-Only TTS

Jul 04, 2024
Via arXiv

Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech

Apr 30, 2024
Via arXiv

StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations

Apr 23, 2024
Via arXiv

The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge

Apr 10, 2024
Via arXiv

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

Jan 30, 2024
Via arXiv