Geewook Kim

How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?

Oct 10, 2024

On Efficient Language and Vision Assistants for Visually-Situated Natural Language Understanding: What Matters in Reading and Reasoning

Jun 17, 2024

CREPE: Coordinate-Aware End-to-End Document Parser

May 01, 2024

HyperCLOVA X Technical Report

Apr 13, 2024

Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation

Jan 12, 2024

SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap

Sep 21, 2023

Cream: Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models

May 24, 2023

Technical Report on Web-based Visual Corpus Construction for Visual Document Understanding

Nov 07, 2022

Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching

Feb 23, 2022

Donut: Document Understanding Transformer without OCR

Nov 30, 2021