Guangzhi Li

SECap: Speech Emotion Captioning with Large Language Model

Dec 23, 2023

Automatic Prosody Annotation with Pre-Trained Text-Speech Model

Jun 16, 2022

Controllable Context-aware Conversational Speech Synthesis

Jun 21, 2021

VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention

Feb 12, 2021

Maximizing Mutual Information for Tacotron

Aug 30, 2019