Eunwoo Song

Training Universal Vocoders with Feature Smoothing-Based Augmentation Methods for High-Quality TTS Systems

Sep 04, 2024
Via arXiv

Unified Speech-Text Pretraining for Spoken Dialog Modeling

Feb 08, 2024
Via arXiv

Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech

Aug 28, 2023
Via arXiv

Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis

Oct 28, 2022
Via arXiv

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

Jul 01, 2022
Via arXiv

TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder

Jun 30, 2022
Via arXiv

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Apr 21, 2022
Via arXiv

Improved parallel WaveGAN vocoder with perceptually weighted spectrogram loss

Jan 19, 2021
Via arXiv

Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators

Oct 27, 2020
Via arXiv

Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

Oct 25, 2019
Via arXiv