Tae-Hyun Oh

POSTECH

Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics

Mar 27, 2025
Via arXiv

FPGS: Feed-Forward Semantic-aware Photorealistic Style Transfer of Large-Scale Gaussian Splatting

Mar 11, 2025
Via arXiv

Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration

Feb 23, 2025
Via arXiv

Zero-shot Depth Completion via Test-time Alignment with Affine-invariant Depth Prior

Feb 10, 2025
Via arXiv

The Devil is in the Details: Simple Remedies for Image-to-LiDAR Representation Learning

Jan 16, 2025
Via arXiv

SoundBrush: Sound as a Brush for Visual Scene Editing

Dec 31, 2024
Via arXiv

Sound2Vision: Generating Diverse Visuals from Audio through Cross-Modal Latent Alignment

Dec 09, 2024
Via arXiv

DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Dec 02, 2024
Via arXiv

AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

Oct 23, 2024
Via arXiv

MeTTA: Single-View to 3D Textured Mesh Reconstruction with Test-Time Adaptation

Aug 21, 2024
Via arXiv