Jaehyeon Kim

Efficient Generative Modeling with Residual Vector Quantization-Based Tokens

Dec 13, 2024

Practical and Reproducible Symbolic Music Generation by Large Language Models with Structural Embeddings

Jul 29, 2024

DiTTo-TTS: Efficient and Scalable Zero-Shot Text-to-Speech with Diffusion Transformer

Jun 17, 2024

CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech

Apr 03, 2024

Progressive Deblurring of Diffusion Models for Coarse-to-Fine Image Synthesis

Jul 16, 2022

Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Jun 11, 2021

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Oct 23, 2020