Philip S. Thomas

Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation

Oct 03, 2024
Via arXiv

Position: Benchmarking is Limited in Reinforcement Learning Research

Jun 23, 2024
Via arXiv

ICU-Sepsis: A Benchmark MDP Built from Real Medical Data

Jun 09, 2024
Via arXiv

From Past to Future: Rethinking Eligibility Traces

Dec 20, 2023
Via arXiv

Behavior Alignment via Reward Function Optimization

Oct 31, 2023
Via arXiv

Learning Fair Representations with High-Confidence Guarantees

Oct 23, 2023
Via arXiv

Coagent Networks: Generalized and Scaled

May 16, 2023
Via arXiv

Optimization using Parallel Gradient Evaluations on Multiple Parameters

Feb 06, 2023
Via arXiv

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

Jan 24, 2023
Via arXiv

Enforcing Delayed-Impact Fairness Guarantees

Aug 24, 2022
Via arXiv