Tong Zhang

Nanjing University of Science and Technology, Nanjing, China

Earth-OneVision: Extending Remote Sensing Multimodal Large Language Models to More Sensor Modalities and Tasks

Jun 09, 2026
Via arXiv

AsyncWebRL: Efficient Multi-Step RL for Visual Web Agents

Jun 04, 2026
Via arXiv

P$^2$-DPO: Grounding Hallucination in Perceptual Processing via Calibration Direct Preference Optimization

Jun 03, 2026
Via arXiv

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory

Jun 02, 2026
Via arXiv

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

Jun 01, 2026
Via arXiv

Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization

Jun 01, 2026
Via arXiv

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

May 28, 2026
Via arXiv

Learning to Label: A Reinforced Self-Evolving Framework for Semi-supervised Referring Expression Segmentation

May 27, 2026
Via arXiv

PRO-CUA: Process-Reward Optimization for Computer Use Agents

May 27, 2026
Via arXiv

Channel-wise Vector Quantization

May 25, 2026
Via arXiv