Aviad Aberdam

DocVLM: Make Your VLM an Efficient Reader

Dec 11, 2024
Via arXiv

TAP-VL: Text Layout-Aware Pre-training for Enriched Vision-Language Models

Nov 07, 2024
Via arXiv

Question Aware Vision Transformer for Multimodal Reasoning

Feb 08, 2024
Via arXiv

GRAM: Global Reasoning for Multi-Page VQA

Jan 07, 2024
Via arXiv

CLIPTER: Looking at the Bigger Picture in Scene Text Recognition

Jan 18, 2023
Via arXiv

Towards Models that Can See and Read

Jan 18, 2023
Via arXiv

Out-of-Vocabulary Challenge Report

Sep 14, 2022
Via arXiv

Multimodal Semi-Supervised Learning for Text Recognition

May 08, 2022
Via arXiv

On Calibration of Scene-Text Recognition Models

Dec 23, 2020
Via arXiv

Sequence-to-Sequence Contrastive Learning for Text Recognition

Dec 20, 2020
Via arXiv