Joni-Kristen Kämäräinen

Swapped goal-conditioned offline reinforcement learning

Feb 17, 2023

Wenyan Yang, Huiling Wang, Dingding Cai, Joni Pajarinen, Joni-Kristen Kämäräinen

Abstract: Offline goal-conditioned reinforcement learning (GCRL) can be challenging due to overfitting to the given dataset. To generalize agents' skills outside the given dataset, we propose a goal-swapping procedure that generates additional trajectories. To alleviate the problem of noise and extrapolation errors, we present a general offline reinforcement learning method called deterministic Q-advantage policy gradient (DQAPG). In the experiments, DQAPG outperforms state-of-the-art goal-conditioned offline RL methods on a wide range of benchmark tasks, and goal-swapping further improves the test results. Notably, the proposed method performs well on the challenging dexterous in-hand manipulation tasks on which prior methods failed.
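The abstract does not spell out how goal-swapping works, so the following is only a minimal sketch of one plausible reading: each trajectory in the offline dataset is copied and paired with a goal taken from a different trajectory. The dictionary layout (`observations`, `actions`, `goal`) and the function name `swap_goals` are assumptions made for illustration, not the authors' data format or implementation.

```python
import random


def swap_goals(trajectories, rng=None):
    """Return new trajectories whose goals come from other trajectories.

    Hypothetical illustration only: each trajectory is assumed to be a dict
    with "observations", "actions", and "goal" keys.
    """
    rng = rng or random.Random(0)
    n = len(trajectories)
    if n < 2:
        return []  # nothing to swap with

    # Shuffle the indices, then give each trajectory the goal of its
    # successor in the shuffled cycle, so no trajectory keeps its own goal.
    order = list(range(n))
    rng.shuffle(order)

    augmented = []
    for pos, idx in enumerate(order):
        donor = order[(pos + 1) % n]
        source = trajectories[idx]
        augmented.append({
            "observations": list(source["observations"]),
            "actions": list(source["actions"]),
            "goal": trajectories[donor]["goal"],
        })
    return augmented


if __name__ == "__main__":
    dataset = [
        {"observations": [0, 1], "actions": [1], "goal": "A"},
        {"observations": [2, 3], "actions": [0], "goal": "B"},
        {"observations": [4, 5], "actions": [1], "goal": "C"},
    ]
    for traj in swap_goals(dataset):
        print(traj["observations"], "->", traj["goal"])
```

In this sketch the swapped goal is not guaranteed to be reached by the copied trajectory, and any stored rewards would have to be recomputed for the new goal; how the method handles the resulting noise is the concern the abstract assigns to DQAPG.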

* arXiv admin note: text overlap with arXiv:2302.07741
