Sangmin Lee

Data Intelligence Laboratory, LG AI Research

MuCo: Multi-turn Contrastive Learning for Multimodal Embedding Model

Feb 06, 2026
Via arXiv

UniverSR: Unified and Versatile Audio Super-Resolution via Vocoder-Free Flow Matching

Oct 01, 2025
Via arXiv

SAGE-LD: Towards Scalable and Generalizable End-to-End Language Diarization via Simulated Data Augmentation

Oct 01, 2025
Via arXiv

UniCoM: A Universal Code-Switching Speech Generator

Aug 21, 2025
Via arXiv

MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization

Jul 28, 2025
Via arXiv

Thunder-Tok: Minimizing Tokens per Word in Tokenizing Korean Texts for Generative Language Models

Jun 18, 2025
Via arXiv

Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training

May 27, 2025
Via arXiv

NTIRE 2025 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results

May 17, 2025
Via arXiv

SocialGesture: Delving into Multi-person Gesture Understanding

Apr 03, 2025
Via arXiv

Question-Aware Gaussian Experts for Audio-Visual Question Answering

Mar 07, 2025
Via arXiv