Zichen Wen

D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning

Dec 22, 2025

IPCV: Information-Preserving Compression for MLLM Visual Encoders

Dec 21, 2025

DOCR-Inspector: Fine-Grained and Automated Evaluation of Document Parsing with VLM

Dec 11, 2025

OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation

Oct 30, 2025

AI for Service: Proactive Assistance with AI Glasses

Oct 16, 2025

ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution

Oct 14, 2025

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods

Oct 08, 2025

AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs

Oct 08, 2025

DTPA: Dynamic Token-level Prefix Augmentation for Controllable Text Generation

Aug 06, 2025

TACTIC: Translation Agents with Cognitive-Theoretic Interactive Collaboration

Jun 11, 2025