Modern meta-reinforcement learning (Meta-RL) methods are largely built on model-agnostic meta-learning, which performs policy gradient steps across tasks to maximize policy performance. However, the gradient conflict problem remains poorly understood in Meta-RL and can degrade performance when the tasks are highly distinct. To tackle this challenge, this paper proposes a novel personalized Meta-RL (pMeta-RL) algorithm, which aggregates task-specific personalized policies to update a meta-policy shared by all tasks, while maintaining each personalized policy to maximize the average return of its own task under the constraint of the meta-policy. We also provide a theoretical analysis in the tabular setting, which establishes the convergence of the pMeta-RL algorithm. Moreover, we extend pMeta-RL to a deep network version based on soft actor-critic, making it suitable for continuous control tasks. Experimental results show that the proposed algorithms outperform previous Meta-RL algorithms on the Gym and MuJoCo suites.
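The sketch below is a minimal illustration (not the paper's implementation) of the personalize-then-aggregate structure described above: each task keeps a personalized parameter vector that maximizes a surrogate of its own return under a proximal constraint toward the meta-policy, and the meta-policy is updated by aggregating the personalized policies. The per-task objectives, the names `task_grad`, `lam`, and `meta_theta`, and all numeric settings are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
num_tasks, dim = 4, 8
optima = rng.normal(size=(num_tasks, dim))      # stand-in per-task optima

def task_grad(theta, k):
    """Gradient of a surrogate per-task return J_k(theta) = -||theta - optima[k]||^2."""
    return -2.0 * (theta - optima[k])

meta_theta = np.zeros(dim)                      # meta-policy parameters
personal = np.zeros((num_tasks, dim))           # personalized policy parameters
lam, lr = 1.0, 0.05                             # proximal weight and step size

for _ in range(200):
    # Personalization: each task ascends its own (surrogate) return,
    # penalized for deviating from the current meta-policy.
    for k in range(num_tasks):
        g = task_grad(personal[k], k) - lam * (personal[k] - meta_theta)
        personal[k] += lr * g
    # Aggregation: the meta-policy moves to the average personalized policy.
    meta_theta = personal.mean(axis=0)

print("meta policy:", np.round(meta_theta, 3))
```

In this toy version the aggregation is a plain average of the personalized parameters; the proximal weight `lam` trades off per-task return against staying close to the meta-policy, which is the mechanism the abstract attributes to pMeta-RL.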