Fan-Ming Luo

Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate

May 24, 2024
Via arXiv

Reward-Consistent Dynamics Models are Strongly Generalizable for Offline Reinforcement Learning

Oct 09, 2023
Via arXiv

Unified Policy Optimization for Continuous-action Reinforcement Learning in Non-stationary Tasks and Games

Aug 19, 2022
Via arXiv

A Survey on Model-based Reinforcement Learning

Jun 19, 2022
Via arXiv

Transferable Reward Learning by Dynamics-Agnostic Discriminator Ensemble

Jun 01, 2022
Via arXiv