Bin Wang

Southeast University, China

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training

Dec 16, 2024
Via arXiv

CoinMath: Harnessing the Power of Coding Instruction for Math LLMs

Dec 16, 2024
Via arXiv

MERaLiON-AudioLLM: Technical Report

Dec 13, 2024
Via arXiv

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Dec 12, 2024
Via arXiv

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

Dec 10, 2024
Via arXiv

Chimera: Improving Generalist Model with Domain-Specific Experts

Dec 08, 2024
Via arXiv

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Dec 03, 2024
Via arXiv

Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding

Nov 25, 2024
Via arXiv

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Oct 29, 2024
Via arXiv

HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation

Oct 28, 2024
Via arXiv