Frans A. Oliehoek

SimuDICE: Offline Policy Optimization Through World Model Updates and DICE Estimation

Dec 09, 2024
Via arXiv

Navigating Trade-offs: Policy Summarization for Multi-Objective Reinforcement Learning

Nov 07, 2024
Via arXiv

Communicating with Speakers and Listeners of Different Pragmatic Levels

Oct 08, 2024
Via arXiv

Online Planning in POMDPs with State-Requests

Jul 26, 2024
Via arXiv

Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory

May 29, 2024
Via arXiv

Policy Space Response Oracles: A Survey

Mar 04, 2024
Via arXiv

When Do Off-Policy and On-Policy Policy Gradient Methods Align?

Feb 19, 2024
Via arXiv

What Lies beyond the Pareto Front? A Survey on Decision-Support Methods for Multi-Objective Optimization

Nov 19, 2023
Via arXiv

Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL

Jun 04, 2023
Via arXiv

What model does MuZero learn?

Jun 01, 2023
Via arXiv