Zhibo Yang

HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction

Nov 02, 2024

VL-Reader: Vision and Language Reconstructor is an Effective Scene Text Recognizer

Sep 18, 2024

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

Aug 27, 2024

Look Hear: Gaze Prediction for Speech-directed Human Attention

Jul 28, 2024

Visual Text Generation in the Wild

Jul 19, 2024

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition

Mar 28, 2024

HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition

Mar 20, 2024

LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training

Jan 03, 2024

Efficient Monaural Speech Enhancement using Spectrum Attention Fusion

Aug 04, 2023

Predicting Human Attention using Computational Attention

Apr 04, 2023