Chunrui Han

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Sep 03, 2024
Via arXiv

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Jun 24, 2024
Via arXiv

Focus Anywhere for Fine-grained Multi-page Document Understanding

May 23, 2024
Via arXiv

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Apr 15, 2024
Via arXiv

ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

Mar 06, 2024
Via arXiv

Small Language Model Meets with Reinforced Vision Vocabulary

Jan 23, 2024
Via arXiv

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Dec 11, 2023
Via arXiv

DreamLLM: Synergistic Multimodal Comprehension and Creation

Sep 20, 2023
Via arXiv

SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers

Aug 14, 2023
Via arXiv

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

Jul 18, 2023
Via arXiv