Aldo Pacchiano

ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization

Oct 17, 2024
Via arXiv

State-free Reinforcement Learning

Sep 27, 2024
Via arXiv

Second Order Bounds for Contextual Bandits with Function Approximation

Sep 24, 2024
Via arXiv

Learning Rate-Free Reinforcement Learning: A Case for Model Selection with Non-Stationary Objectives

Aug 07, 2024
Via arXiv

Provable Interactive Learning with Hindsight Instruction Feedback

Apr 14, 2024
Via arXiv

Multiple-policy Evaluation via Density Estimation

Mar 29, 2024
Via arXiv

Provably Sample Efficient RLHF via Active Preference Optimization

Feb 16, 2024
Via arXiv

A Framework for Partially Observed Reward-States in RLHF

Feb 05, 2024
Via arXiv

Contextual Bandits with Stage-wise Constraints

Jan 15, 2024
Via arXiv

Experiment Planning with Function Approximation

Jan 10, 2024
Via arXiv