Zheng Wen

Online Bandit Learning with Offline Preference Data

Jun 13, 2024
Via arXiv

RLHF and IIA: Perverse Incentives

Dec 02, 2023
Via arXiv

Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach

Oct 17, 2023
Via arXiv

Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale

Mar 20, 2023
Via arXiv

Approximate Thompson Sampling via Epistemic Neural Networks

Feb 18, 2023
Via arXiv

Leveraging Demonstrations to Improve Online Learning: Quality Matters

Feb 08, 2023
Via arXiv

Robustness of Epinets against Distributional Shifts

Jul 01, 2022
Via arXiv

Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping

Jun 08, 2022
Via arXiv

An Analysis of Ensemble Sampling

Mar 02, 2022
Via arXiv

Evaluating High-Order Predictive Distributions in Deep Learning

Feb 28, 2022
Via arXiv