Inchul Hwang

Investigating Disentanglement in a Phoneme-level Speech Codec for Prosody Modeling

Sep 13, 2024

Improved Text Emotion Prediction Using Combined Valence and Arousal Ordinal Classification

Apr 02, 2024

Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations

Feb 02, 2024

Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis

Nov 02, 2022

Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis

Nov 01, 2022

Generating Gender-Ambiguous Text-to-Speech Voices

Nov 01, 2022

Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features

Nov 01, 2022

Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation

Oct 31, 2022

Faster Re-translation Using Non-Autoregressive Model For Simultaneous Neural Machine Translation

Dec 29, 2020

Ensemble-Based Deep Reinforcement Learning for Chatbots

Aug 27, 2019