Jiafan He

Accelerated Preference Optimization for Large Language Model Alignment

Oct 08, 2024
Via arXiv

Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback

Apr 16, 2024
Via arXiv

Settling Constant Regrets in Linear Markov Decision Processes

Apr 16, 2024
Via arXiv

Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption

Feb 15, 2024
Via arXiv

Reinforcement Learning from Human Feedback with Active Queries

Feb 14, 2024
Via arXiv

Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

Feb 14, 2024
Via arXiv

A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation

Nov 26, 2023
Via arXiv

Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning

Oct 02, 2023
Via arXiv

Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs

May 15, 2023
Via arXiv

Uniform-PAC Guarantees for Model-Based RL with Bounded Eluder Dimension

May 15, 2023
Via arXiv