Abstract: This paper presents a model-free approximation of the Hessian of the performance of deterministic policies, for use in Reinforcement Learning via Quasi-Newton steps in the policy parameters. We show that the approximate Hessian converges to the exact Hessian at the optimal policy, and that it enables superlinear convergence in the learning, provided that the policy parametrization is rich enough. The natural policy gradient method can be interpreted as a particular case of the proposed method. We verify the formulation analytically in a simple linear case and compare the convergence of the proposed method with the natural policy gradient in a nonlinear example.
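To make the update concrete, below is a minimal sketch of a Quasi-Newton step on the performance in the policy parameters. The callables `grad_J` and `hess_J`, the regularization, and the toy quadratic performance are illustrative assumptions, not the paper's actual model-free estimator; the quadratic case simply mimics the behaviour near the optimum, where the approximate Hessian matches the exact one.

```python
# Hypothetical sketch: one Quasi-Newton step theta <- theta - H^{-1} g.
# `grad_J` and `hess_J` stand in for the (approximate) gradient and
# Hessian of the performance J; names are placeholders.
import numpy as np

def quasi_newton_step(theta, grad_J, hess_J, reg=1e-6):
    """Return theta - H^{-1} g, with a small regularization on H
    to keep the linear solve well posed."""
    g = grad_J(theta)
    H = hess_J(theta) + reg * np.eye(theta.size)
    return theta - np.linalg.solve(H, g)

# Toy usage: J(theta) = 0.5 theta^T A theta - b^T theta, so a single
# exact-Hessian step lands (near) the minimizer A^{-1} b, mirroring the
# fast local convergence the abstract describes.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
theta = quasi_newton_step(np.zeros(2),
                          grad_J=lambda t: A @ t - b,
                          hess_J=lambda t: A)
print(theta, np.linalg.solve(A, b))
```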
Abstract: We present a Reinforcement Learning-based Robust Nonlinear Model Predictive Control (RL-RNMPC) framework for controlling nonlinear systems in the presence of disturbances and uncertainties. An approximate RNMPC scheme of low computational complexity is used, in which the state-trajectory uncertainty is modelled via ellipsoids. Reinforcement Learning is then used to handle the ellipsoidal approximation and improve the closed-loop performance of the scheme by adjusting the MPC parameters that generate the ellipsoids. The approach is tested on a simulated Wheeled Mobile Robot (WMR) tracking a desired trajectory while avoiding static obstacles.
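The following is a minimal sketch of the kind of ellipsoidal uncertainty propagation such an approximate RNMPC relies on, under assumed linearized dynamics `A_k` and an assumed additive disturbance shape `W`; the specific recursion and names are illustrative, and the paper's RL layer would tune the parameters shaping these ellipsoids rather than fix them by hand.

```python
# Hypothetical sketch: propagate ellipsoid shape matrices Sigma_k along
# the prediction horizon and compute constraint back-offs from them.
import numpy as np

def propagate_ellipsoids(A_seq, W, Sigma0):
    """Sigma_{k+1} = A_k Sigma_k A_k^T + W along the horizon."""
    Sigmas = [Sigma0]
    for A_k in A_seq:
        Sigmas.append(A_k @ Sigmas[-1] @ A_k.T + W)
    return Sigmas

def tightened_margin(Sigma, h):
    """Back-off for a half-space constraint h^T x <= b:
    the ellipsoid's support in direction h, sqrt(h^T Sigma h)."""
    return np.sqrt(h @ Sigma @ h)

# Usage: a 2-state system with a small additive disturbance ellipsoid.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
W = 0.01 * np.eye(2)
Sigmas = propagate_ellipsoids([A] * 10, W, np.zeros((2, 2)))
print([round(tightened_margin(S, np.array([1.0, 0.0])), 3) for S in Sigmas])
```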
Abstract: In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies with a (nearly) bang-bang structure. We focus on policy approximations based on Model Predictive Control (MPC) and on the use of the deterministic policy gradient method to optimize the MPC closed-loop performance in the presence of unmodelled stochasticity or model error. When the policy has a (nearly) bang-bang structure, we observe that the policy gradient method can struggle to produce meaningful steps in the policy parameters. To tackle this issue, we propose a homotopy strategy based on the interior-point method, which relaxes the policy during learning. We investigate a specific well-known battery storage problem, and show that the proposed method delivers more homogeneous and faster learning than a classical policy gradient approach.
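To illustrate the smoothing mechanism, here is a minimal sketch of an interior-point-style relaxation of a scalar bang-bang decision u in [-1, 1] with linear stage cost c*u; the barrier weight tau acts as the homotopy parameter. The battery problem and the actual MPC relaxation in the paper are richer; this closed form only shows how tau interpolates between a smooth policy and the bang-bang limit u = -sign(c).

```python
# Hypothetical sketch: log-barrier relaxation of min_u c*u, u in [-1, 1].
# Stationarity of c*u - tau*(log(1-u) + log(1+u)) gives
# c*u**2 - 2*tau*u - c = 0, whose interior root is used below.
import numpy as np

def relaxed_bang_bang(c, tau):
    """Minimizer of c*u - tau*(log(1-u) + log(1+u)) over u in (-1, 1)."""
    return (tau - np.sqrt(tau**2 + c**2)) / c

for tau in (1.0, 0.1, 0.01):
    print(tau, relaxed_bang_bang(np.array([2.0, -2.0]), tau))
# As tau shrinks, the outputs approach the bang-bang values (-1.0, 1.0);
# larger tau yields the smooth, relaxed policy used early in learning.
```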