Ming-Hsuan Yang

DrivingAgent: Design and Scheduling Agents for Autonomous Driving Systems

Jun 11, 2026
Via arXiv

H2HMem: A Multimodal Memory Benchmark for Agents in Human-Human Interactions

Jun 08, 2026
Via arXiv

UniSHARP: Universal Sharp Monocular View Synthesis

Jun 05, 2026
Via arXiv

Reasmory: 3D Reconstruction as Explicit Memory for VLMs Spatial Reasoning

May 31, 2026
Via arXiv

CV-Arena: An Open Benchmark for Instructional Computer Vision Problem Solving with Human-AI Collaborative Preferences

May 30, 2026
Via arXiv

MotiMotion: Motion-Controlled Video Generation with Visual Reasoning

May 21, 2026
Via arXiv

GeoWeaver: Grounding Visual Tokens with Geometric Evidence before Scene Reasoning

May 21, 2026
Via arXiv

SAMOFT: Robust Multi-Object Tracking via Region and Flow

May 10, 2026
Via arXiv

AlbumFill: Album-Guided Reasoning and Retrieval for Personalized Image Completion

May 04, 2026
Via arXiv

Evolution of Video Generative Foundations

Apr 07, 2026
Via arXiv