Sanzhar Rakhimkul

Robust MAML: Prioritization task buffer with adaptive learning process for model-agnostic meta-learning

Mar 15, 2021

Thanh Nguyen, Tung Luu, Trung Pham, Sanzhar Rakhimkul, Chang D. Yoo

Abstract: Model-agnostic meta-learning (MAML) is a popular state-of-the-art meta-learning algorithm that provides a good weight initialization for a model given a variety of learning tasks. A model initialized with these weights can be fine-tuned to an unseen task using only a small number of samples and a few adaptation steps. MAML is simple and versatile, but it requires costly learning-rate tuning and careful design of the task distribution, both of which affect its scalability and generalization. This paper proposes a more robust variant, Robust MAML (RMAML), based on an adaptive learning scheme and a prioritization task buffer (PTB), to improve the scalability of the training process and alleviate the problem of distribution mismatch. RMAML uses gradient-based hyper-parameter optimization to automatically find the optimal learning rate and uses the PTB to gradually shift the training task distribution toward the testing task distribution over the course of training. Experimental results on meta reinforcement learning environments demonstrate a substantial performance gain, lower sensitivity to hyper-parameter choice, and robustness to distribution mismatch.
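To make the idea of tuning the learning rate by gradient descent concrete, here is a minimal toy sketch in Python. It is not the RMAML algorithm from the paper. It assumes one-dimensional quadratic tasks L_c(w) = 0.5 * (w - c)^2, a single inner adaptation step, and exact hand-derived gradients. The function names (`maml_quadratic`, `post_adaptation_loss`) and all hyper-parameter values are hypothetical choices for illustration, and the prioritization task buffer is not modeled.

```python
def post_adaptation_loss(w, alpha, centers):
    """Mean task loss after one inner gradient step from initialization w."""
    total = 0.0
    for c in centers:
        w_adapted = w - alpha * (w - c)  # one inner step on L_c(w) = 0.5 * (w - c)**2
        total += 0.5 * (w_adapted - c) ** 2
    return total / len(centers)


def maml_quadratic(centers, w=0.0, alpha=0.1, meta_lr=0.1, alpha_lr=0.05, steps=200):
    """Toy MAML loop that also updates the inner learning rate alpha by gradient descent.

    Assumption: each task is a 1-D quadratic centered at c, so all gradients
    below are exact and written out by hand rather than computed by autodiff.
    """
    n = len(centers)
    for _ in range(steps):
        grad_w = 0.0
        grad_alpha = 0.0
        for c in centers:
            inner_grad = w - c                   # dL_c/dw at the shared initialization
            w_adapted = w - alpha * inner_grad   # inner adaptation step
            residual = w_adapted - c             # equals (1 - alpha) * (w - c)
            # Chain rule through the inner step for the loss 0.5 * residual**2:
            grad_w += residual * (1.0 - alpha)       # derivative w.r.t. initialization w
            grad_alpha += residual * (-inner_grad)   # derivative w.r.t. inner learning rate
        w -= meta_lr * grad_w / n          # outer (meta) update of the initialization
        alpha -= alpha_lr * grad_alpha / n # gradient-based update of the learning rate
    return w, alpha


centers = [-1.0, 0.0, 2.0]  # hypothetical task parameters
loss_before = post_adaptation_loss(0.0, 0.1, centers)
w_final, alpha_final = maml_quadratic(centers)
loss_after = post_adaptation_loss(w_final, alpha_final, centers)
```

In this toy setting the learned inner learning rate moves toward 1, the value that solves each quadratic task in a single step, so the post-adaptation loss drops without any manual learning-rate tuning. Real meta reinforcement learning tasks have no such closed form, and a practical implementation would rely on automatic differentiation instead.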
