Alexander Rakhlin

Refined Risk Bounds for Unbounded Losses via Transductive Priors

Oct 29, 2024
Via arXiv

How Does Variance Shape the Regret in Contextual Bandits?

Oct 16, 2024
Via arXiv

Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability

Oct 07, 2024
Via arXiv

Random Latent Exploration for Deep Reinforcement Learning

Jul 18, 2024
Via arXiv

Near-Optimal Learning and Planning in Separated Latent MDPs

Jun 12, 2024
Via arXiv

Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

May 31, 2024
Via arXiv

The Power of Resets in Online Reinforcement Learning

Apr 26, 2024
Via arXiv

Online Estimation via Offline Estimation: An Information-Theoretic Framework

Apr 15, 2024
Via arXiv

Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data

Mar 25, 2024
Via arXiv

On the Performance of Empirical Risk Minimization with Smoothed Data

Feb 22, 2024
Via arXiv