Modern policy optimization methods in applied reinforcement learning are often inspired by the trust region policy optimization algorithm, which can be interpreted as a particular instance of policy mirror descent. While theoretical guarantees have been established for this framework, particularly in the tabular setting, the use of general parametrization schemes remains largely unjustified. In this work, we introduce a novel framework for policy optimization based on mirror descent that naturally accommodates general parametrizations. The policy class induced by our scheme recovers known classes, e.g., tabular softmax, log-linear, and neural policies. It also generates new ones, depending on the choice of mirror map. For a general mirror map and parametrization function, we establish the quasi-monotonicity of the updates in terms of the value function, derive global linear convergence rates, and bound the total variation of the algorithm along its path. To showcase the ability of our framework to accommodate general parametrization schemes, we present a case study involving shallow neural networks.
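For context, the display below sketches the standard (tabular) policy mirror descent update that frameworks of this kind build on; it is a generic sketch rather than the parametrized update analysed in this work, and the notation ($\pi_t$ for the current policy, $\eta$ for the step size, $Q^{\pi_t}$ for the state-action value function, $h$ for the mirror map, and $D_h$ for its Bregman divergence) is assumed rather than taken from this paper.
%% Generic (assumed) tabular policy mirror descent step, stated per state $s$:
\[
\pi_{t+1}(\cdot \mid s) \in \operatorname*{arg\,max}_{p \in \Delta(\mathcal{A})}
\Big\{ \eta \, \big\langle Q^{\pi_t}(s, \cdot),\, p \big\rangle - D_h\big(p,\, \pi_t(\cdot \mid s)\big) \Big\},
\qquad
D_h(p, q) = h(p) - h(q) - \big\langle \nabla h(q),\, p - q \big\rangle .
\]
When $h$ is the negative entropy, this step becomes a multiplicative-weights (softmax) update, which is the connection to the tabular softmax policy class mentioned above; other choices of mirror map yield other policy classes.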