Jalaj Bhandari

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023
Via arXiv

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

May 24, 2023
Via arXiv

A Note on the Linear Convergence of Policy Gradient Methods

Jul 21, 2020
Via arXiv

Global Optimality Guarantees For Policy Gradient Methods

Jun 05, 2019
Via arXiv

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation

Nov 06, 2018
Via arXiv