Mao Hong

MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning

Jan 21, 2024
Via arXiv

A Policy Gradient Method for Confounded POMDPs

May 26, 2023
Via arXiv