Linke Ouyang

MinerU: An Open-Source Solution for Precise Document Content Extraction

Sep 27, 2024
Via arXiv

CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation

Sep 05, 2024
Via arXiv

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Jul 03, 2024
Via arXiv

DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data

May 28, 2024
Via arXiv

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Apr 09, 2024
Via arXiv

InternLM2 Technical Report

Mar 26, 2024
Via arXiv

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model

Jan 29, 2024
Via arXiv

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization

Nov 28, 2023
Via arXiv

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition

Sep 29, 2023
Via arXiv

MLLM-DataEngine: An Iterative Refinement Approach for MLLM

Sep 11, 2023
Via arXiv