Guoxi Zhang

VickreyFeedback: Cost-efficient Data Construction for Reinforcement Learning from Human Feedback

Sep 27, 2024
Via arXiv

INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations

Mar 19, 2024
Via arXiv

Online Policy Learning from Offline Preferences

Mar 15, 2024
Via arXiv

Estimating Treatment Effects Under Heterogeneous Interference

Sep 25, 2023
Via arXiv

On Modeling Long-Term User Engagement from Stochastic Feedback

Feb 13, 2023
Via arXiv

Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning

Nov 29, 2022
Via arXiv

Batch Reinforcement Learning from Crowds

Nov 08, 2021
Via arXiv