Xiangtai Li

Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models

Feb 02, 2026
Via arXiv

SAMTok: Representing Any Mask with Two Words

Jan 22, 2026
Via arXiv

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Dec 18, 2025
Via arXiv

RecTok: Reconstruction Distillation along Rectified Flow

Dec 17, 2025
Via arXiv

EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing

Dec 12, 2025
Via arXiv

WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World

Dec 11, 2025
Via arXiv

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Nov 18, 2025
Via arXiv

Visual Spatial Tuning

Nov 07, 2025
Via arXiv

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark

Oct 30, 2025
Via arXiv

From Masks to Worlds: A Hitchhiker's Guide to World Models

Oct 23, 2025
Via arXiv