Zhibo Yang

Generative Compositor for Few-Shot Visual Information Extraction

Mar 21, 2025
Via arXiv

Enhancing Deep Reinforcement Learning-based Robot Navigation Generalization through Scenario Augmentation

Mar 03, 2025
Via arXiv

Beyond Visibility Limits: A DRL-Based Navigation Strategy for Unexpected Obstacles

Mar 03, 2025
Via arXiv

OmniParser V2: Structured-Points-of-Thought for Unified Visual Text Parsing and Its Generality to Multimodal Large Language Models

Feb 22, 2025
Via arXiv

Qwen2.5-VL Technical Report

Feb 19, 2025
Via arXiv

SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild

Jan 07, 2025
Via arXiv

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

Dec 03, 2024
Via arXiv

HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction

Nov 02, 2024
Via arXiv

VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer

Sep 18, 2024
Via arXiv

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

Aug 27, 2024
Via arXiv