Siyang Qin

PaliGemma 2: A Family of Versatile VLMs for Transfer

Dec 04, 2024

Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

Oct 17, 2024

Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis

Oct 25, 2023

ICDAR 2023 Competition on Hierarchical Text Detection and Recognition

May 16, 2023

FormNetV2: Multimodal Graph Contrastive Learning for Form Document Information Extraction

May 04, 2023

Towards End-to-End Unified Scene Text Detection and Layout Analysis

Mar 28, 2022

ROPE: Reading Order Equivariant Positional Encoding for Graph-based Document Information Extraction

Jun 21, 2021

Rethinking Text Line Recognition Models

Apr 21, 2021

Towards Unconstrained End-to-End Text Spotting

Aug 24, 2019

Automatic Semantic Content Removal by Learning to Neglect

Jul 20, 2018