Srikar Appalaraju

DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models

Oct 04, 2024
Via arXiv

VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding

Jul 17, 2024
Via arXiv

RAVEN: Multitask Retrieval Augmented Vision-Language Learning

Jun 27, 2024
Via arXiv

Enhancing Vision-Language Pre-training with Rich Supervisions

Mar 05, 2024
Via arXiv

DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models

Nov 15, 2023
Via arXiv

Multiple-Question Multiple-Answer Text-VQA

Nov 15, 2023
Via arXiv

A Multi-Modal Multilingual Benchmark for Document Image Classification

Oct 25, 2023
Via arXiv

DocFormerv2: Local Features for Document Understanding

Jun 02, 2023
Via arXiv

SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation

Feb 07, 2023
Via arXiv

YORO -- Lightweight End to End Visual Grounding

Nov 15, 2022
Via arXiv