Xiangyu Yue

OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents

May 06, 2026
Via arXiv

A Progressive Training Strategy for Vision-Language Models to Counteract Spatio-Temporal Hallucinations in Embodied Reasoning

Apr 12, 2026
Via arXiv

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Apr 02, 2026
Via arXiv

Gen-Searcher: Reinforcing Agentic Search for Image Generation

Mar 30, 2026
Via arXiv

GIDE: Unlocking Diffusion LLMs for Precise Training-Free Image Editing

Mar 22, 2026
Via arXiv

FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair

Mar 18, 2026
Via arXiv

PreciseCache: Precise Feature Caching for Efficient and High-fidelity Video Generation

Mar 03, 2026
Via arXiv

Elastic Diffusion Transformer

Feb 15, 2026
Via arXiv

UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model

Feb 15, 2026
Via arXiv

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Feb 15, 2026
Via arXiv