Reda Ouhamma

CRIStAL

Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability

Dec 13, 2023

Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration and Planning

Oct 05, 2022

Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge

Nov 02, 2021

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

Oct 18, 2021

Is Standard Deviation the New Standard? Revisiting the Critic in Deep Policy Gradients

Oct 09, 2020