Zhihang Zhong

Moment-Video: Diagnosing Temporal Fidelity of Video MLLMs on Momentary Visual Events

Jun 01, 2026

PhotoFlow: Agentic 3D Virtual Photography Missions

May 22, 2026

SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

May 21, 2026

Perceptual Flow Network for Visually Grounded Reasoning

May 04, 2026

COSMO-Agent: Tool-Augmented Agent for Closed-loop Optimization, Simulation, and Modeling Orchestration

Apr 07, 2026

GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

Mar 12, 2026

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Mar 10, 2026

Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports

Mar 10, 2026

Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence

Mar 08, 2026

RISE-Video: Can Video Generators Decode Implicit World Rules?

Feb 05, 2026