Yu Zeng

Corresponding author

NüshuVoice: Reviving the Voice of Endangered Nüshu with Pitch-Aware Text-to-Speech

Jun 08, 2026
Via arXiv

ACC: Compiling Agent Trajectories for Long-Context Training

May 21, 2026
Via arXiv

VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph

Feb 13, 2026
Via arXiv

Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models

Feb 10, 2026
Via arXiv

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Feb 02, 2026
Via arXiv

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Jan 29, 2026
Via arXiv

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Jan 08, 2026
Via arXiv

"PhyWorldBench": A Comprehensive Evaluation of Physical Realism in Text-to-Video Models

Jul 17, 2025
Via arXiv

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

May 28, 2025
Via arXiv

Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation

May 05, 2025
Via arXiv