Won Jang

Intelli-Z: Toward Intelligible Zero-Shot TTS

Jan 25, 2024
Via arXiv

FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs

May 18, 2023
Via arXiv

UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

Jun 15, 2021
Via arXiv

Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains

Nov 19, 2020
Via arXiv

JDI-T: Jointly trained Duration Informed Transformer for Text-To-Speech without Explicit Alignment

May 15, 2020
Via arXiv