Jianwu Dang

Progressive Residual Extraction based Pre-training for Speech Representation Learning

Aug 31, 2024
Via arXiv

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

Aug 11, 2024
Via arXiv

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Jun 13, 2024
Via arXiv

A Refining Underlying Information Framework for Monaural Speech Enhancement

Dec 24, 2023
Via arXiv

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Dec 22, 2023
Via arXiv

Ahpatron: A New Budgeted Online Kernel Learning Machine with Tighter Mistake Bound

Dec 12, 2023
Via arXiv

High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models

Sep 27, 2023
Via arXiv

Learning Speech Representation From Contrastive Token-Acoustic Pretraining

Sep 06, 2023
Via arXiv

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding

Jul 28, 2023
Via arXiv

Rethinking the visual cues in audio-visual speaker extraction

Jun 05, 2023
Via arXiv