Title: Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation

Jun 14, 2021

Anas Barakat, Pascal Bianchi, Julien Lehmann

Figure 1 for Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation


Abstract: Actor-critic methods integrating target networks have exhibited stupendous empirical success in deep reinforcement learning. However, a theoretical understanding of the use of target networks in actor-critic methods is largely missing in the literature. In this paper, we bridge this gap between theory and practice by proposing the first theoretical analysis of an online target-based actor-critic algorithm with linear function approximation in the discounted reward setting. Our algorithm uses three different timescales: one for the actor and two for the critic. Instead of using the standard single-timescale temporal difference (TD) learning algorithm as a critic, we use a two-timescale target-based version of TD learning closely inspired by practical actor-critic algorithms implementing target networks. First, we establish asymptotic convergence results for both the critic and the actor under Markovian sampling. Then, we provide a finite-time analysis showing the impact of incorporating a target network into actor-critic methods.
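As a rough illustration of the two-timescale, target-based critic mentioned in the abstract, the sketch below runs TD learning with linear features in which the bootstrap target is computed from a separate, slowly updated weight vector. It is not the paper's algorithm: the actor (the third timescale) is omitted, the step sizes `alpha` and `beta` are constant, and the two-state deterministic chain, the one-hot features, and all names are assumptions chosen only to keep the example small and reproducible.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def target_based_td(features, rewards, next_state, gamma=0.5,
                    alpha=0.1, beta=0.01, steps=20000):
    """Policy evaluation with a target-based TD(0) critic (illustrative sketch).

    w     : critic weights, updated with the faster step size alpha
    w_bar : target weights, tracking w with the slower step size beta
    """
    dim = len(features[0])
    w = [0.0] * dim
    w_bar = [0.0] * dim
    s = 0
    for _ in range(steps):
        s_next = next_state[s]
        phi = features[s]
        # TD error: the bootstrap term uses the target weights w_bar, not w.
        delta = rewards[s] + gamma * dot(features[s_next], w_bar) - dot(phi, w)
        # Fast timescale: critic update along the current feature vector.
        w = [wi + alpha * delta * fi for wi, fi in zip(w, phi)]
        # Slow timescale: the target weights move a small step toward the critic.
        w_bar = [bi + beta * (wi - bi) for wi, bi in zip(w, w_bar)]
        s = s_next
    return w, w_bar


# Hypothetical two-state chain 0 -> 1 -> 0 with one-hot (tabular) features.
FEATURES = [[1.0, 0.0], [0.0, 1.0]]
REWARDS = [1.0, 0.0]
NEXT_STATE = [1, 0]

w, w_bar = target_based_td(FEATURES, REWARDS, NEXT_STATE)
print(w, w_bar)
```

For this toy chain with discount 0.5, the Bellman equations give values 4/3 and 2/3 for the two states, so both weight vectors should settle near those numbers. In an actor-critic setting, a policy update on a third step-size schedule would sit alongside these two updates.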

* 34 pages
