Priyank Agrawal

Improved Sample Complexity for Global Convergence of Actor-Critic Algorithms

Oct 11, 2024

Optimistic Q-learning for average reward and episodic reinforcement learning

Jul 18, 2024

Improved Optimistic Algorithm For The Multinomial Logit Contextual Bandit

Nov 28, 2020

Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration

Oct 23, 2020

Learning by Repetition: Stochastic Multi-armed Bandits under Priming Effect

Jun 18, 2020

Incentivising Exploration and Recommendations for Contextual Bandits with Payments

Jan 22, 2020

Bandits with Temporal Stochastic Constraints

Nov 22, 2018