Wanqiao Xu

Exploration Unbound

Jul 16, 2024

Pearl: A Production-ready Reinforcement Learning Agent

Dec 06, 2023

RLHF and IIA: Perverse Incentives

Dec 02, 2023

Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models

May 19, 2023

Posterior Sampling for Continuing Environments

Nov 29, 2022

Safely Bridging Offline and Online Reinforcement Learning

Oct 25, 2021