Chenyang Miao

Multi-Goal Dexterous Hand Manipulation using Probabilistic Model-based Reinforcement Learning

Apr 30, 2025

Yingzhuo Jiang, Wenjun Huang, Rongdun Lin, Chenyang Miao, Tianfu Sun, Yunduan Cui

Abstract: This paper tackles the challenge of learning multi-goal dexterous hand manipulation tasks using model-based reinforcement learning. We propose Goal-Conditioned Probabilistic Model Predictive Control (GC-PMPC), which uses probabilistic neural network ensembles to describe the high-dimensional dexterous hand dynamics and introduces an asynchronous MPC policy to meet the control frequency requirements of real-world dexterous hand systems. Extensive evaluations on four simulated Shadow Hand manipulation scenarios with randomly generated goals demonstrate GC-PMPC's superior performance over state-of-the-art baselines. GC-PMPC also successfully drives a cable-driven dexterous hand, DexHand 021, with 12 active DOFs and 5 tactile sensors, to learn to manipulate a cubic die to three goal poses within approximately 80 minutes of interaction, demonstrating exceptional learning efficiency and control performance on a cost-effective dexterous hand platform.

Effective Multi-Agent Deep Reinforcement Learning Control with Relative Entropy Regularization

Sep 26, 2023

Chenyang Miao, Yunduan Cui, Huiyun Li, Xinyu Wu

Abstract: This paper proposes a novel Multi-Agent Reinforcement Learning (MARL) approach, Multi-Agent Continuous Dynamic Policy Gradient (MACDPP), to tackle the issues of limited capability and sample efficiency in various scenarios controlled by multiple agents. It alleviates the inconsistency of multiple agents' policy updates by introducing relative entropy regularization to the Centralized Training with Decentralized Execution (CTDE) framework with the Actor-Critic (AC) structure. Evaluated on multi-agent cooperation and competition tasks and on traditional control tasks, including OpenAI benchmarks and robot arm manipulation, MACDPP demonstrates significant superiority in learning capability and sample efficiency over both related multi-agent baselines and widely implemented single-agent baselines, and therefore expands the potential of MARL for effectively learning challenging control scenarios.
