Haochen Wang

DRIVE: Distributional and Retrieval-Augmented Bidding with Value Evaluation

Jun 12, 2026
Via arXiv

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

Jun 05, 2026
Via arXiv

ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison

May 19, 2026
Via arXiv

VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification

Apr 02, 2026
Via arXiv

Self-Evolving Recommendation System: End-To-End Autonomous Model Optimization With LLM Agents

Feb 10, 2026
Via arXiv

U-Net Based Image Enhancement for Short-time Muon Scattering Tomography

Feb 05, 2026
Via arXiv

SAMTok: Representing Any Mask with Two Words

Jan 22, 2026
Via arXiv

MMFormalizer: Multimodal Autoformalization in the Wild

Jan 06, 2026
Via arXiv

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Nov 18, 2025
Via arXiv

CrossVid: A Comprehensive Benchmark for Evaluating Cross-Video Reasoning in Multimodal Large Language Models

Nov 15, 2025
Via arXiv