Jiwan Chung

Global Geometry Is Not Enough for Vision Representations

Feb 03, 2026
Via arXiv

What MLLMs Learn about When they Learn about Multimodal Reasoning: Perception, Reasoning, or their Integration?

Oct 02, 2025
Via arXiv

Are Any-to-Any Models More Consistent Across Modality Transfers Than Specialists?

May 30, 2025
Via arXiv

Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation

May 24, 2025
Via arXiv

Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation

Apr 07, 2025
Via arXiv

VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms

Mar 18, 2025
Via arXiv

GuideDog: A Real-World Egocentric Multimodal Dataset for Blind and Low-Vision Accessibility-Aware Guidance

Mar 17, 2025
Via arXiv

Teaching Metric Distance to Autoregressive Multimodal Foundational Models

Mar 04, 2025
Via arXiv

MASS: Overcoming Language Bias in Image-Text Matching

Jan 20, 2025
Via arXiv

SEAL: Entangled White-box Watermarks on Low-Rank Adaptation

Jan 16, 2025
Via arXiv