Xiangyu Yue

Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning

Jun 08, 2026

Benchmark Everything Everywhere All at Once

Jun 04, 2026

X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding

Jun 01, 2026

$τ_0$-WM: A Unified Video-Action World Model for Robotic Manipulation

May 31, 2026

Learning Structural Latent Points for Efficient Visual Representations in Robotic Manipulation

May 20, 2026

BitLM: Unlocking Multi-Token Language Generation with Bitwise Continuous Diffusion

May 12, 2026

From Web to Pixels: Bringing Agentic Search into Visual Perception

May 12, 2026

OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents

May 06, 2026

A Progressive Training Strategy for Vision-Language Models to Counteract Spatio-Temporal Hallucinations in Embodied Reasoning

Apr 12, 2026

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Apr 02, 2026