Peter Wurman

The Trajectory Alignment Coefficient in Two Acts: From Reward Tuning to Reward Learning

Jan 23, 2026

Event Tables for Efficient Experience Replay

Nov 01, 2022

Reinforcement Learning for Optimization of COVID-19 Mitigation policies

Oct 20, 2020