Gunu Jho

Investigating Disentanglement in a Phoneme-level Speech Codec for Prosody Modeling

Sep 13, 2024
Via arXiv

Improved Text Emotion Prediction Using Combined Valence and Arousal Ordinal Classification

Apr 02, 2024
Via arXiv

Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations

Feb 02, 2024
Via arXiv

Fine-grained Noise Control for Multispeaker Speech Synthesis

Apr 11, 2022
Via arXiv

Karaoker: Alignment-free singing voice synthesis with speech training data

Apr 08, 2022
Via arXiv

Self supervised learning for robust voice cloning

Apr 07, 2022
Via arXiv

SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis

Apr 06, 2022
Via arXiv