Long-Fei Li

Provably Efficient RLHF Pipeline: A Unified View from Contextual Bandits

Feb 11, 2025
Via arXiv

Near-Optimal Dynamic Regret for Adversarial Linear Mixture MDPs

Nov 05, 2024
Via arXiv

Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition

Mar 07, 2024
Via arXiv

Dynamic Regret of Online Markov Decision Processes

Aug 26, 2022
Via arXiv