Picture for Shanghang Zhang

Shanghang Zhang

EmbodiedOcc++: Boosting Embodied 3D Occupancy Prediction with Plane Regularization and Uncertainty Sampler

Add code
Apr 13, 2025
Viaarxiv icon

Segment Any Motion in Videos

Add code
Mar 28, 2025
Viaarxiv icon

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning

Add code
Mar 27, 2025
Viaarxiv icon

MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation

Add code
Mar 26, 2025
Viaarxiv icon

EmpathyAgent: Can Embodied Agents Conduct Empathetic Actions?

Add code
Mar 19, 2025
Viaarxiv icon

HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model

Add code
Mar 13, 2025
Viaarxiv icon

RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete

Add code
Feb 28, 2025
Viaarxiv icon

MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation

Add code
Feb 19, 2025
Viaarxiv icon

CordViP: Correspondence-based Visuomotor Policy for Dexterous Manipulation in Real-World

Add code
Feb 12, 2025
Viaarxiv icon

LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information

Add code
Feb 04, 2025
Figure 1 for LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information
Figure 2 for LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information
Figure 3 for LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information
Figure 4 for LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information
Viaarxiv icon