Hui-Peng Du

CFMDCTCodec: A Low-Bitrate Neural Speech Codec with Noise-Prior-aware Conditional Flow Matching for MDCT-Spectral Enhancement

May 26, 2026
Via arXiv

Ultra-Low-Bitrate Mel-Spectrogram-based Neural Speech Coding with Flow-Matching-based Refinement and Vocoding-driven Reconstruction

May 25, 2026
Via arXiv

CodeSep: Low-Bitrate Codec-Driven Speech Separation with Base-Token Disentanglement and Auxiliary-Token Serial Prediction

Jan 19, 2026
Via arXiv

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis

Sep 18, 2025
Via arXiv

Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding

Sep 04, 2025
Via arXiv

Is GAN Necessary for Mel-Spectrogram-based Neural Vocoder?

Aug 11, 2025
Via arXiv

Vision-Integrated High-Quality Neural Speech Coding

May 29, 2025
Via arXiv

Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising

May 22, 2025
Via arXiv

Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis

Dec 22, 2024
Via arXiv

A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions

Nov 19, 2024
Via arXiv