Yun Li

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding

Mar 18, 2025

Is LLMs Hallucination Usable? LLM-based Negative Reasoning for Fake News Detection

Mar 12, 2025

Better Process Supervision with Bi-directional Rewarding Signals

Mar 06, 2025

Investigating the Adaptive Robustness with Knowledge Conflicts in LLM-based Multi-Agent Systems

Feb 21, 2025

Sce2DriveX: A Generalized MLLM Framework for Scene-to-Drive Learning

Feb 19, 2025

CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing

Feb 04, 2025

Exploring the Role of Explicit Temporal Modeling in Multimodal Large Language Models for Video Understanding

Jan 28, 2025

VisionLLM-based Multimodal Fusion Network for Glottic Carcinoma Early Detection

Dec 24, 2024

Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective

Dec 23, 2024

Compositional Zero-Shot Learning with Contextualized Cues and Adaptive Contrastive Training

Dec 10, 2024