Bin Xie

PriorVLA: Prior-Preserving Adaptation for Vision-Language-Action Models

May 11, 2026

GS-Playground: A High-Throughput Photorealistic Simulator for Vision-Informed Robot Learning

Apr 28, 2026

HATS: Hardness-Aware Trajectory Synthesis for GUI Agents

Mar 12, 2026

DM0: An Embodied-Native Vision-Language-Action Model towards Physical AI

Feb 16, 2026

Towards Robust Process Reward Modeling via Noise-aware Learning

Jan 19, 2026

R²PO: Decoupling Training Trajectories from Inference Responses for LLM Reasoning

Jan 17, 2026

MaskMed: Decoupled Mask and Class Prediction for Medical Image Segmentation

Nov 19, 2025

SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation

Nov 12, 2025

MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation

Aug 26, 2025

GeoVLA: Empowering 3D Representations in Vision-Language-Action Models

Aug 12, 2025