Pengxiang Ding

VLAS: Vision-Language-Action Model With Speech Instructions For Customized Robot Manipulation

Feb 19, 2025

Score-Based Diffusion Policy Compatible with Reinforcement Learning via Optimal Transport

Feb 18, 2025

GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation

Feb 13, 2025

Rethinking Latent Representations in Behavior Cloning: An Information Bottleneck Approach for Robot Manipulation

Feb 05, 2025

QUART-Online: Latency-Free Large Multimodal Language Model for Quadruped Robot Learning

Dec 23, 2024

Score and Distribution Matching Policy: Advanced Accelerated Visuomotor Policies via Matched Distillation

Dec 13, 2024

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction

Dec 09, 2024

Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration

Nov 26, 2024

ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification

Sep 30, 2024

PiTe: Pixel-Temporal Alignment for Large Video-Language Model

Sep 11, 2024