Siteng Huang

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model

Sep 11, 2025
Via arXiv

Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation

Aug 28, 2025
Via arXiv

Towards Affordance-Aware Robotic Dexterous Grasping with Human-like Priors

Aug 12, 2025
Via arXiv

WorldVLA: Towards Autoregressive Action World Model

Jun 26, 2025
Via arXiv

VARD: Efficient and Dense Fine-Tuning for Diffusion Models with Value-based RL

May 21, 2025
Via arXiv

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning

May 18, 2025
Via arXiv

OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation

May 06, 2025
Via arXiv

Unicorn: Text-Only Data Synthesis for Vision Language Model Training

Mar 28, 2025
Via arXiv

Exploring the Evolution of Physics Cognition in Video Generation: A Survey

Mar 27, 2025
Via arXiv

Humanoid-VLA: Towards Universal Humanoid Control with Visual Integration

Feb 21, 2025
Via arXiv