Xiangtai Li

From Masks to Worlds: A Hitchhiker's Guide to World Models

Oct 23, 2025

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Oct 23, 2025

Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs

Oct 22, 2025

One Flight Over the Gap: A Survey from Perspective to Panoramic Vision

Sep 04, 2025

PointDGRWKV: Generalizing RWKV-like Architecture to Unseen Domains for Point Cloud Classification

Aug 29, 2025

Human-in-Context: Unified Cross-Domain 3D Human Motion Modeling via In-Context Learning

Aug 14, 2025

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology

Jul 10, 2025

Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning

Jul 02, 2025

Dense360: Dense Understanding from Omnidirectional Panoramas

Jun 17, 2025

Omni-AdaVideoRAG: Omni-Contextual Adaptive Retrieval-Augmented for Efficient Long Video Understanding

Jun 16, 2025