Yongmao Zhang

Text-aware and Context-aware Expressive Audiobook Speech Synthesis

Jun 12, 2024

Accent-VITS: accent transfer for end-to-end TTS

Dec 29, 2023

PromptSpeaker: Speaker Generation Based on Text Descriptions

Oct 08, 2023

METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer

Jul 29, 2023

The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task

Jul 10, 2023

PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions

Jun 01, 2023

Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling

Nov 19, 2022

VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer

Nov 05, 2022

Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS

Nov 02, 2022

DSPGAN: a GAN-based universal vocoder for high-fidelity TTS by time-frequency domain supervision from DSP

Nov 02, 2022