Text Extraction From Documents


Text extraction from documents is the process of recovering text data from scanned documents or images.

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm

Feb 18, 2025
Via arXiv

RA-MTR: A Retrieval Augmented Multi-Task Reader based Approach for Inspirational Quote Extraction from Long Documents

Feb 17, 2025
Via arXiv

KET-RAG: A Cost-Efficient Multi-Granular Indexing Framework for Graph-RAG

Feb 13, 2025
Via arXiv

KARMA: Leveraging Multi-Agent LLMs for Automated Knowledge Graph Enrichment

Feb 10, 2025
Via arXiv

Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents

Feb 06, 2025
Via arXiv

Invizo: Arabic Handwritten Document Optical Character Recognition Solution

Feb 07, 2025
Via arXiv

Overcoming Vision Language Model Challenges in Diagram Understanding: A Proof-of-Concept with XML-Driven Large Language Models Solutions

Feb 05, 2025
Via arXiv

Investigating Corporate Social Responsibility Initiatives: Examining the case of corporate Covid-19 response

Feb 05, 2025
Via arXiv

Efficient extraction of medication information from clinical notes: an evaluation in two languages

Feb 05, 2025
Via arXiv

Context-Aware Hierarchical Merging for Long Document Summarization

Feb 03, 2025
Via arXiv