Xuedong He

Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator

Jun 08, 2024

Wenhao Xu, Xuefeng Gao, Xuedong He

Abstract: The risk-sensitive linear quadratic regulator is one of the most fundamental problems in risk-sensitive optimal control. In this paper, we study online adaptive control of the risk-sensitive linear quadratic regulator in the finite-horizon episodic setting. We propose a simple least-squares greedy algorithm and show that it achieves $\widetilde{\mathcal{O}}(\log N)$ regret under a specific identifiability assumption, where $N$ is the total number of episodes. If the identifiability assumption is not satisfied, we propose incorporating exploration noise into the least-squares-based algorithm, resulting in an algorithm with $\widetilde{\mathcal{O}}(\sqrt{N})$ regret. To the best of our knowledge, this is the first set of regret bounds for the episodic risk-sensitive linear quadratic regulator. Our proof relies on perturbation analysis of less-standard Riccati equations for risk-sensitive linear quadratic control, and a delicate analysis of the loss in the risk-sensitive performance criterion due to applying the suboptimal controller in the online learning process.
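To illustrate the least-squares step that such an algorithm builds on, here is a minimal sketch of system identification from observed data. It assumes linear dynamics of the form x_{t+1} = A x_t + B u_t + w_t, which is the standard LQR setup but is not spelled out in the abstract. The function name `estimate_dynamics` and the single-trajectory data layout are hypothetical; the paper's actual estimator (for example, how it pools data across episodes) may differ.

```python
import numpy as np

def estimate_dynamics(states, inputs):
    """Least-squares estimate of (A, B) in x_{t+1} = A x_t + B u_t + w_t.

    states: array of shape (T + 1, n), the visited states x_0, ..., x_T
    inputs: array of shape (T, m), the applied controls u_0, ..., u_{T-1}
    (Hypothetical layout chosen for this sketch.)
    """
    X = np.asarray(states, dtype=float)
    U = np.asarray(inputs, dtype=float)
    n = X.shape[1]
    # Each regressor row is [x_t, u_t]; each target row is x_{t+1}.
    Z = np.hstack([X[:-1], U])
    # Solve min_Theta ||Z Theta - X[1:]||_F, where Theta stacks A^T over B^T.
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
    A_hat = Theta[:n].T
    B_hat = Theta[n:].T
    return A_hat, B_hat
```

A greedy controller would then be computed by treating the estimates as if they were the true parameters. When the collected inputs do not excite all directions of the system, the regressor matrix is rank deficient and the estimate is not unique, which is one intuition for why exploration noise can be needed.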

Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents

Jan 30, 2023

Wenhao Xu, Xuefeng Gao, Xuedong He

Abstract: The optimized certainty equivalent (OCE) is a family of risk measures that covers important examples such as entropic risk, conditional value-at-risk, and mean-variance models. In this paper, we propose a new episodic risk-sensitive reinforcement learning formulation based on tabular Markov decision processes with recursive OCEs. We design an efficient learning algorithm for this problem based on value iteration and upper confidence bounds. We derive an upper bound on the regret of the proposed algorithm, and also establish a minimax lower bound. Our bounds show that the regret rate achieved by our proposed algorithm has optimal dependence on the number of episodes and the number of actions.
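As a concrete illustration of two members of the OCE family, the sketch below computes sample-based estimates of the entropic risk and the conditional value-at-risk. It assumes X is a reward (larger is better) and uses the common formulation OCE(X) = sup over lam of lam + E[u(X - lam)]; the sign conventions and the recursive construction used in the paper may differ. The function names are hypothetical.

```python
import numpy as np

def entropic_risk(samples, gamma):
    """Entropic certainty equivalent -(1/gamma) * log E[exp(-gamma * X)].

    Assumes gamma > 0 (risk-averse) and that samples are rewards.
    """
    x = np.asarray(samples, dtype=float)
    z = -gamma * x
    m = z.max()
    # log-mean-exp, shifted by the maximum for numerical stability
    return -(m + np.log(np.mean(np.exp(z - m)))) / gamma

def cvar(samples, alpha):
    """Lower-tail CVaR at level alpha in (0, 1], computed as
    sup over lam of lam - E[(lam - X)_+] / alpha.
    """
    x = np.asarray(samples, dtype=float)
    # The objective is concave and piecewise linear in lam, so a
    # maximizer lies at one of the sample values.
    objective = [lam - np.mean(np.maximum(lam - x, 0.0)) / alpha for lam in x]
    return max(objective)
```

For the samples 1, 2, 3, 4 and alpha = 0.5, `cvar` returns 1.5, the average of the worst half of the outcomes. Both measures return the constant itself when every sample is equal, and both lie at or below the sample mean, which is what makes them risk-averse summaries of a reward distribution.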

Improving Model Drift for Robust Object Tracking

Dec 02, 2019

Qiujie Dong, Xuedong He, Haiyan Ge, Qin Liu, Aifu Han, Shengzong Zhou

Abstract: Discriminative correlation filters show excellent performance in object tracking. However, in complex scenes, the appearance of the tracked target varies, which can easily pollute the model and cause model drift. In this paper, considering that the secondary peak has a greater impact on the model update, we first propose a method for detecting the primary and secondary peaks of the response map. Second, we propose a novel confidence function that uses an adaptive update discriminant mechanism and yields good robustness. Third, we propose a robust correlation-filter tracker that uses hand-crafted features and reduces model drift in complex scenes. Finally, to handle the merging of multi-feature responses in current trackers, we propose a simple exponential adaptive merge approach. Extensive experiments are performed on the OTB2013, OTB100, and TC128 datasets. Our approach performs favorably against several state-of-the-art trackers while running in real time.
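The abstract does not describe how the primary and secondary peaks are detected, so the following is only a generic sketch of one plausible approach: treat a cell of the response map as a peak when it is strictly greater than all eight of its neighbors, then take the two largest peak values. The function name `top_two_peaks` and the 8-neighborhood rule are assumptions for illustration, not the authors' method.

```python
import numpy as np

def top_two_peaks(response):
    """Return the two largest strict local maxima of a 2D response map.

    A cell counts as a peak if it is strictly greater than its 8 neighbors
    (an assumption of this sketch). Missing peaks are returned as None.
    """
    r = np.asarray(response, dtype=float)
    h, w = r.shape
    # Pad with -inf so that border cells can still qualify as peaks.
    padded = np.pad(r, 1, mode="constant", constant_values=-np.inf)
    is_peak = np.ones((h, w), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            is_peak &= r > neighbor
    values = np.sort(r[is_peak])[::-1]  # peak heights, largest first
    primary = values[0] if values.size > 0 else None
    secondary = values[1] if values.size > 1 else None
    return primary, secondary
```

A confidence measure could then compare the two values, for example by checking how close the secondary peak is to the primary one before allowing a model update. That comparison rule is likewise hypothetical here.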

* 7 pages, 6 figures, 4 tables
