Yaxin Peng

PointVLA: Injecting the 3D World into Vision-Language-Action Models

Mar 10, 2025
Via arXiv

ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration

Feb 26, 2025
Via arXiv

ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model

Feb 21, 2025
Via arXiv

Efficient Feature Fusion for UAV Object Detection

Jan 29, 2025
Via arXiv

Fresh-CL: Feature Realignment through Experts on Hypersphere in Continual Learning

Jan 04, 2025
Via arXiv

Improving Vision-Language-Action Models via Chain-of-Affordance

Dec 29, 2024
Via arXiv

Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression

Dec 04, 2024
Via arXiv

Scaling Diffusion Policy in Transformer to 1 Billion Parameters for Robotic Manipulation

Sep 22, 2024
Via arXiv

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

Jun 28, 2024
Via arXiv

Mipha: A Comprehensive Overhaul of Multimodal Assistant with Small Language Models

Mar 15, 2024
Via arXiv