
Yuhao Wang

Jack

Cross-Modal Coreference Alignment: Enabling Reliable Information Transfer in Omni-LLMs

Apr 07, 2026
Via arXiv

A Flow Matching Framework for Soft-Robot Inverse Dynamics

Apr 03, 2026
Via arXiv

VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification

Apr 02, 2026
Via arXiv

HFP-SAM: Hierarchical Frequency Prompted SAM for Efficient Marine Animal Segmentation

Mar 13, 2026
Via arXiv

Emulating Clinician Cognition via Self-Evolving Deep Clinical Research

Mar 11, 2026
Via arXiv

RAGTrack: Language-aware RGBT Tracking with Retrieval-Augmented Generation

Mar 04, 2026
Via arXiv

QCAgent: An agentic framework for quality-controllable pathology report generation from whole slide image

Mar 02, 2026
Via arXiv

DeepResearch-9K: A Challenging Benchmark Dataset of Deep-Research Agent

Mar 01, 2026
Via arXiv

STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification

Feb 28, 2026
Via arXiv

VocalNet-MDM: Accelerating Streaming Speech LLM via Self-Distilled Masked Diffusion Modeling

Feb 09, 2026
Via arXiv