Abstract: It has been more than six decades since the introduction of the theory of dual control \cite{feldbaum1960dual}. Although it has provided rich insights into the fields of control, estimation, and system identification, dual control is generally computationally prohibitive. In recent years, however, the use of Koopman operator theory for control applications has been emerging. This paper presents a new reformulation of the stochastic optimal control problem that, by employing the Koopman operator, yields a standard LQR problem whose solution is the dual control. We conclude the paper with a numerical example that demonstrates the effectiveness of the proposed approach, compared to certainty-equivalence control, when applied to systems with varying observability.
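For concreteness, a minimal sketch of what such a lifted formulation could look like, assuming a Koopman lifting $z_k = \psi(x_k)$ with lifted matrices $A$, $B$ and cost weights $Q$, $R$ (notation introduced here for illustration, not taken from the abstract):
\begin{align}
  \min_{u_0, u_1, \dots} \quad & \mathbb{E}\!\left[\sum_{k=0}^{\infty} z_k^\top Q z_k + u_k^\top R u_k\right] \\
  \text{s.t.} \quad & z_{k+1} = A z_k + B u_k, \qquad z_k = \psi(x_k),
\end{align}
whose optimal feedback $u_k = -K z_k$ follows from the standard discrete-time Riccati recursion; in the reformulation described above, the solution of this LQR problem plays the role of the dual controller.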
Abstract: In this paper we provide a framework to cope with two problems: (i) the fragility of reinforcement learning to modeling uncertainties caused by the mismatch between controlled laboratory/simulation conditions and real-world conditions, and (ii) the prohibitive computational cost of stochastic optimal control. We approach both problems by using reinforcement learning to solve the stochastic dynamic programming equation. The resulting reinforcement learning controller is safe with respect to several types of constraints and can actively learn about the modeling uncertainties. Unlike exploration and exploitation, probing and safety are employed automatically by the controller itself, resulting in real-time learning. A simulation example demonstrates the efficacy of the proposed approach.
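As a hedged illustration of the stochastic dynamic programming equation mentioned above, written over a generic information state $\xi_k$ with stage cost $\ell$ and admissible inputs $\mathcal{U}$ (symbols assumed here for exposition only, not taken from the abstract):
\begin{equation}
  V(\xi_k) = \min_{u_k \in \mathcal{U}} \, \mathbb{E}\!\left[\, \ell(x_k, u_k) + V(\xi_{k+1}) \;\middle|\; \xi_k, u_k \right],
\end{equation}
where the expectation propagates the modeling uncertainty through the information state; using reinforcement learning to approximate this recursion, rather than solving it exactly, is what addresses the computational cost noted in (ii).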