Jianjian Sun

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Sep 03, 2024
Via arXiv

Focus Anywhere for Fine-grained Multi-page Document Understanding

May 23, 2024
Via arXiv

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Apr 15, 2024
Via arXiv

Small Language Model Meets with Reinforced Vision Vocabulary

Jan 23, 2024
Via arXiv

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Dec 11, 2023
Via arXiv

DreamLLM: Synergistic Multimodal Comprehension and Creation

Sep 20, 2023
Via arXiv

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning

Jul 18, 2023
Via arXiv

The 1st-place Solution for CVPR 2023 OpenLane Topology in Autonomous Driving Challenge

Jun 16, 2023
Via arXiv

BEVStereo++: Accurate Depth Estimation in Multi-view 3D Object Detection via Dynamic Temporal Stereo

Apr 09, 2023
Via arXiv

Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception

Mar 13, 2023
Via arXiv