Ashutosh Nayyar

Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning

Aug 17, 2024
Via arXiv

Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget

May 23, 2024
Via arXiv

Model approximation in MDPs with unbounded per-step cost

Feb 13, 2024
Via arXiv

Regret Analysis of the Posterior Sampling-based Learning Algorithm for Episodic POMDPs

Oct 16, 2023
Via arXiv

Conditional Kernel Imitation Learning for Continuous State Environments

Aug 24, 2023
Via arXiv

Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes

May 24, 2023
Via arXiv

A Novel Point-based Algorithm for Multi-agent Control Using the Common Information Approach

Apr 10, 2023
Via arXiv

Learning Zero-sum Stochastic Games with Posterior Sampling

Sep 08, 2021
Via arXiv

A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems

Aug 19, 2021
Via arXiv

Scalable regret for learning to control network-coupled subsystems with unknown dynamics

Aug 18, 2021
Via arXiv