Helin Wang

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Sep 17, 2024
Via arXiv

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer

Sep 12, 2024
Via arXiv

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis

Sep 11, 2024
Via arXiv

DreamVoice: Text-Guided Voice Conversion

Jun 24, 2024
Via arXiv

Noise-robust Speech Separation with Fast Generative Correction

Jun 11, 2024
Via arXiv

Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback

Jun 02, 2024
Via arXiv

Asynchronous and Segmented Bidirectional Encoding for NMT

Feb 19, 2024
Via arXiv

Efficient Reinforcement Learning via Decoupling Exploration and Utilization

Jan 17, 2024
Via arXiv

Improving Fairness for Spoken Language Understanding in Atypical Speech with Text-to-Speech

Nov 16, 2023
Via arXiv

DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction

Oct 10, 2023
Via arXiv