Hyun-Wook Yoon

Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech

Aug 28, 2023
Via arXiv

Cross-Lingual Transfer Learning for Phrase Break Prediction with Multilingual Language Model

Jun 05, 2023
Via arXiv

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

Jul 01, 2022
Via arXiv

TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder

Jun 30, 2022
Via arXiv

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Apr 21, 2022
Via arXiv

Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder

Aug 16, 2020
Via arXiv