Geewook Kim

Context-Informed Grounding Supervision

Jun 18, 2025

MambaMia: A State-Space-Model-Based Compression for Efficient Video Understanding in Large Multimodal Models

Jun 16, 2025

MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models

Jun 05, 2025

Evaluating Multimodal Generative AI with Korean Educational Standards

Feb 21, 2025

How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?

Oct 10, 2024

On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning

Jun 17, 2024

CREPE: Coordinate-Aware End-to-End Document Parser

May 01, 2024

HyperCLOVA X Technical Report

Apr 13, 2024

Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation

Jan 12, 2024

SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap

Sep 21, 2023