Kim Sung-Bin

Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment

Dec 09, 2024

AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

Oct 23, 2024

Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert

Jul 01, 2024

MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset

Jun 20, 2024

Revisiting Learning-based Video Motion Magnification for Real-time Processing

Mar 04, 2024

SMILE: Multimodal Dataset for Understanding Laughter in Video with Language Models

Dec 15, 2023

LaughTalk: Expressive 3D Talking Head Generation with Laughter

Nov 02, 2023

A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization

Oct 06, 2023

The Devil in the Details: Simple and Effective Optical Flow Synthetic Data Generation

Aug 14, 2023

Prefix tuning for automated audio captioning

Apr 04, 2023