Heeseung Kim

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Nov 23, 2024
Via arXiv

Style-Friendly SNR Sampler for Style-Driven Generation

Nov 22, 2024
Via arXiv

VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance

Sep 24, 2024
Via arXiv

NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers

Sep 24, 2024
Via arXiv

VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech

Aug 27, 2024
Via arXiv

HyperCLOVA X Technical Report

Apr 13, 2024
Via arXiv

Unified Speech-Text Pretraining for Spoken Dialog Modeling

Feb 08, 2024
Via arXiv

UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data

Jun 28, 2023
Via arXiv

Edit-A-Video: Single Video Editing with Object-Aware Consistency

Apr 01, 2023
Via arXiv

Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data

May 30, 2022
Via arXiv