Linfeng Zhang

Shanghai Jiao Tong University

Inverse Knowledge Search over Verifiable Reasoning: Synthesizing a Scientific Encyclopedia from a Long Chains-of-Thought Knowledge Base

Oct 30, 2025

Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks

Oct 29, 2025

Mixing Importance with Diversity: Joint Optimization for KV Cache Compression in Large Vision-Language Models

Oct 23, 2025

AI for Service: Proactive Assistance with AI Glasses

Oct 16, 2025

ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution

Oct 14, 2025

AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs

Oct 08, 2025

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods

Oct 08, 2025

LMM-Incentive: Large Multimodal Model-based Incentive Design for User-Generated Content in Web 3.0

Oct 06, 2025

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Sep 26, 2025

PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era

Sep 16, 2025