Gandharv Patil

On learning history based policies for controlling Markov decision processes

Nov 06, 2022
Via arXiv

Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation

Oct 12, 2022
Via arXiv

Variance Penalized On-Policy and Off-Policy Actor-Critic

Feb 03, 2021
Via arXiv