Rui Shao

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Apr 15, 2026

Multimodal Dataset Distillation via Phased Teacher Models

Mar 26, 2026

HATS: Hardness-Aware Trajectory Synthesis for GUI Agents

Mar 12, 2026

ΔVLA: Prior-Guided Vision-Language-Action Models via World Knowledge Variation

Mar 09, 2026

Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation

Feb 22, 2026

Inject Once Survive Later: Backdooring Vision-Language-Action Models to Persist Through Downstream Fine-tuning

Jan 31, 2026

Learning to Accelerate Vision-Language-Action Models through Adaptive Visual Token Caching

Jan 31, 2026

PersonalAlign: Hierarchical Implicit Intent Alignment for Personalized GUI Agent with Long-Term User-Centric Records

Jan 14, 2026

SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation

Nov 13, 2025

CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification

Aug 28, 2025