Bruno Castro da Silva

Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation

Oct 03, 2024
Via arXiv

Position: Benchmarking is Limited in Reinforcement Learning Research

Jun 23, 2024
Via arXiv

RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs

Apr 12, 2024
Via arXiv

From Past to Future: Rethinking Eligibility Traces

Dec 20, 2023
Via arXiv

Behavior Alignment via Reward Function Optimization

Oct 31, 2023
Via arXiv

Coagent Networks: Generalized and Scaled

May 16, 2023
Via arXiv

Off-Policy Evaluation for Action-Dependent Non-Stationary Environments

Jan 24, 2023
Via arXiv

Model-Based Reinforcement Learning with SINDy

Aug 30, 2022
Via arXiv

Enforcing Delayed-Impact Fairness Guarantees

Aug 24, 2022
Via arXiv

Universal Off-Policy Evaluation

Apr 26, 2021
Via arXiv