Mao Hong

MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning

Jan 21, 2024
Via arXiv

A Policy Gradient Method for Confounded POMDPs

May 26, 2023
Via arXiv