Junhyeok Kim

GuideDog: A Real-World Egocentric Multimodal Dataset for Blind and Low-Vision Accessibility-Aware Guidance

Mar 17, 2025
Via arXiv

Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding

Mar 08, 2025
Via arXiv

See What You Are Told: Visual Attention Sink in Large Multimodal Models

Mar 05, 2025
Via arXiv

WoLF: Wide-scope Large Language Model Framework for CXR Understanding

Mar 29, 2024
Via arXiv

Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms

Oct 16, 2023
Via arXiv