Shangtong Zhang

Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning

Oct 08, 2024
Via arXiv

Doubly Optimal Policy Evaluation for Reinforcement Learning

Oct 03, 2024
Via arXiv

Almost Sure Convergence of Average Reward Temporal Difference Learning

Sep 29, 2024
Via arXiv

Almost Sure Convergence of Linear Temporal Difference Learning with Arbitrary Features

Sep 18, 2024
Via arXiv

Efficient Multi-Policy Evaluation for Reinforcement Learning

Aug 16, 2024
Via arXiv

Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning

May 22, 2024
Via arXiv

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Feb 06, 2024
Via arXiv

AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning

Aug 07, 2023
Via arXiv

Direct Gradient Temporal Difference Learning

Aug 02, 2023
Via arXiv

Improving Monte Carlo Evaluation with Offline Data

Jan 31, 2023
Via arXiv