Abstract:In this paper, we analyze the regret incurred by a computationally efficient exploration strategy, known as naive exploration, for controlling unknown partially observable systems within the Linear Quadratic Gaussian (LQG) framework. We introduce a two-phase control algorithm called LQG-NAIVE, which involves an initial phase of injecting Gaussian input signals to obtain a system model, followed by a second phase of an interplay between naive exploration and control in an episodic fashion. We show that LQG-NAIVE achieves a regret growth rate of $\tilde{\mathcal{O}}(\sqrt{T})$, i.e., $\mathcal{O}(\sqrt{T})$ up to logarithmic factors after $T$ time steps, and we validate its performance through numerical simulations. Additionally, we propose LQG-IF2E, which extends the exploration signal to a `closed-loop' setting by incorporating the Fisher Information Matrix (FIM). We provide compelling numerical evidence of the competitive performance of LQG-IF2E compared to LQG-NAIVE.
Abstract:For a receding-horizon controller with a known system and with an approximate terminal value function, it is well-known that increasing the prediction horizon can improve its control performance. However, when the prediction model is inexact, a larger prediction horizon also causes propagation and accumulation of the prediction error. In this work, we aim to analyze the effect of the above trade-off between the modeling error, the terminal value function error, and the prediction horizon on the performance of a nominal receding-horizon linear quadratic (LQ) controller. By developing a novel perturbation result of the Riccati difference equation, a performance upper bound is obtained and suggests that for many cases, the prediction horizon should be either 1 or infinity to improve the control performance, depending on the relative difference between the modeling error and the terminal value function error. The obtained suboptimality performance bound is also applied to provide end-to-end performance guarantees, e.g., regret bounds, for nominal receding-horizon LQ controllers in a learning-based setting.
Abstract:This work aims to derive a data-independent finite-sample error bound for the least-squares (LS) estimation error of switched linear systems when the state and the switching signal are measured. While the existing finite-sample bounds for linear system identification extend to the problem under consideration, the Gramian of the switched system, an essential term in the error bound, depends on the measured switching signal. Therefore, data-independent bounds on the spectrum of the Gramian are developed for globally asymptotically and marginally stable switched systems when the switching is arbitrary or subject to an average dwell time constraint. Combining the bounds on the spectrum of the Gramian and the preliminary error bound extended from linear system identification leads to the error bound for the LS estimate of the switched system.