Csaba Szepesvári

Almost Free: Self-concordance in Natural Exponential Families and an Application to Bandits

Oct 01, 2024

Confident Natural Policy Gradient for Local Planning in $q_π$-realizable Constrained MDPs

Jun 26, 2024

Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^π$-Realizability and Concentrability

May 27, 2024

Regret Minimization via Saddle Point Optimization

Mar 15, 2024

Switching the Loss Reduces the Cost in Batch Reinforcement Learning

Mar 12, 2024

Ensemble Sampling for Linear Bandits: Small Ensembles Suffice

Nov 14, 2023

Exploration via Linearly Perturbed Loss Minimisation

Nov 13, 2023

Stochastic Gradient Descent for Gaussian Processes Done Right

Oct 31, 2023

Online RL in Linearly $q^π$-Realizable MDPs Is as Easy as in Linear MDPs If You Learn What to Ignore

Oct 11, 2023

The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation

Jul 25, 2023