Document Layout Analysis


Document layout analysis (DLA) is the process of analyzing the spatial arrangement of a document's content to recover its structure. It locates text, tables, images, and other elements on the page and identifies the overall hierarchy, such as headings and subheadings. DLA supports extracting and categorizing information and helps automate document processing workflows.
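To make the idea of "elements and their locations" concrete, here is a minimal sketch of how the output of a layout analysis step might be represented. The `LayoutElement` schema, its field names, the label strings, and the `reading_order` helper are all assumptions made for illustration; they are not taken from any of the papers listed below, and real systems use richer schemas and more robust ordering.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LayoutElement:
    """One detected region on a page (hypothetical schema, for illustration)."""
    label: str   # assumed label set, e.g. "heading", "text", "table", "image"
    x0: float    # left edge
    y0: float    # top edge (origin at top-left, y grows downward)
    x1: float    # right edge
    y1: float    # bottom edge


def reading_order(elements: List[LayoutElement]) -> List[LayoutElement]:
    """Naive single-column reading order: top to bottom, then left to right."""
    return sorted(elements, key=lambda e: (e.y0, e.x0))


# Made-up regions for a single page, listed in arbitrary detection order.
page = [
    LayoutElement("text", 50, 120, 550, 400),
    LayoutElement("heading", 50, 40, 550, 80),
    LayoutElement("table", 50, 420, 550, 700),
]

labels = [e.label for e in reading_order(page)]
print(labels)  # ['heading', 'text', 'table']
```

Sorting by the top-left corner only works for simple single-column pages; multi-column or nested layouts would need a different ordering strategy.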

HAND: Hierarchical Attention Network for Multi-Scale Handwritten Document Recognition and Layout Analysis

Dec 25, 2024

LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating

Dec 24, 2024

DoPTA: Improving Document Layout Analysis using Patch-Text Alignment

Dec 17, 2024

Zero-Shot Prompting and Few-Shot Fine-Tuning: Revisiting Document Image Classification Using Large Language Models

Dec 18, 2024

SAIL: Sample-Centric In-Context Learning for Document Information Extraction

Dec 22, 2024

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

Dec 10, 2024

Diachronic Document Dataset for Semantic Layout Analysis

Nov 15, 2024

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Oct 16, 2024

MinerU: An Open-Source Solution for Precise Document Content Extraction

Sep 27, 2024

VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

Oct 14, 2024