
Mridul Agarwal

Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach

Sep 13, 2021

Concave Utility Reinforcement Learning with Zero-Constraint Violations

Sep 12, 2021

On the Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) using Mean Field Control (MFC)

Sep 09, 2021

Markov Decision Processes with Long-Term Average Constraints

Jun 12, 2021

Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm

May 28, 2021

Communication Efficient Parallel Reinforcement Learning

Feb 22, 2021

Multi-Agent Multi-Armed Bandits with Limited Communication

Feb 10, 2021

Blind Decision Making: Reinforcement Learning with Delayed Observations

Nov 16, 2020

DART: aDaptive Accept RejecT for non-linear top-K subset identification

Nov 16, 2020

Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent

Oct 03, 2019