Picture for Hongjie Zhang

Hongjie Zhang

PhotoFlow: Agentic 3D Virtual Photography Missions

Add code
May 22, 2026
Viaarxiv icon

SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

Add code
May 21, 2026
Viaarxiv icon

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Add code
Mar 26, 2026
Viaarxiv icon

ScaleEdit-12M: Scaling Open-Source Image Editing Data Generation via Multi-Agent Framework

Add code
Mar 21, 2026
Viaarxiv icon

Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning

Add code
Mar 17, 2026
Viaarxiv icon

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Add code
Mar 10, 2026
Viaarxiv icon

Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence

Add code
Mar 08, 2026
Viaarxiv icon

Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning

Add code
Jan 30, 2026
Viaarxiv icon

MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics

Add code
Oct 14, 2025
Figure 1 for MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics
Figure 2 for MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics
Figure 3 for MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics
Figure 4 for MTOS: A LLM-Driven Multi-topic Opinion Simulation Framework for Exploring Echo Chamber Dynamics
Viaarxiv icon

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Add code
Aug 25, 2025
Figure 1 for InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Figure 2 for InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Figure 3 for InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Figure 4 for InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Viaarxiv icon