Chenpeng Du

Recent Advances in Discrete Speech Tokens: A Review

Feb 10, 2025

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Feb 06, 2025

Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective

Dec 22, 2024

LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec

Oct 21, 2024

vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders

Sep 03, 2024

Language Model Can Listen While Speaking

Aug 05, 2024

AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding

May 06, 2024

Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech

Apr 30, 2024

GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting

Apr 29, 2024

The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge

Apr 10, 2024