Abstract:Bayesian optimization (BO) is a widely used method for data-driven optimization that generally relies on zeroth-order data of objective function to construct probabilistic surrogate models. These surrogates guide the exploration-exploitation process toward finding global optimum. While Gaussian processes (GPs) are commonly employed as surrogates of the unknown objective function, recent studies have highlighted the potential of Bayesian neural networks (BNNs) as scalable and flexible alternatives. Moreover, incorporating gradient observations into GPs, when available, has been shown to improve BO performance. However, the use of gradients within BNN surrogates remains unexplored. By leveraging automatic differentiation, gradient information can be seamlessly integrated into BNN training, resulting in more informative surrogates for BO. We propose a gradient-informed loss function for BNN training, effectively augmenting function observations with local gradient information. The effectiveness of this approach is demonstrated on well-known benchmarks in terms of improved BNN predictions and faster BO convergence as the number of decision variables increases.
Abstract:Incorporating user preferences into multi-objective Bayesian optimization (MOBO) allows for personalization of the optimization procedure. Preferences are often abstracted in the form of an unknown utility function, estimated through pairwise comparisons of potential outcomes. However, utility-driven MOBO methods can yield solutions that are dominated by nearby solutions, as non-dominance is not enforced. Additionally, classical MOBO commonly relies on estimating the entire Pareto-front to identify the Pareto-optimal solutions, which can be expensive and ignore user preferences. Here, we present a new method, termed preference-utility-balanced MOBO (PUB-MOBO), that allows users to disambiguate between near-Pareto candidate solutions. PUB-MOBO combines utility-based MOBO with local multi-gradient descent to refine user-preferred solutions to be near-Pareto-optimal. To this end, we propose a novel preference-dominated utility function that concurrently preserves user-preferences and dominance amongst candidate solutions. A key advantage of PUB-MOBO is that the local search is restricted to a (small) region of the Pareto-front directed by user preferences, alleviating the need to estimate the entire Pareto-front. PUB-MOBO is tested on three synthetic benchmark problems: DTLZ1, DTLZ2 and DH1, as well as on three real-world problems: Vehicle Safety, Conceptual Marine Design, and Car Side Impact. PUB-MOBO consistently outperforms state-of-the-art competitors in terms of proximity to the Pareto-front and utility regret across all the problems.
Abstract:Deep neural networks are increasingly used as an effective way to represent control policies in a wide-range of learning-based control methods. For continuous-time optimal control problems (OCPs), which are central to many decision-making tasks, control policy learning can be cast as a neural ordinary differential equation (NODE) problem wherein state and control constraints are naturally accommodated. This paper presents a Lyapunov-NODE control (L-NODEC) approach to solving continuous-time OCPs for the case of stabilizing a known constrained nonlinear system around a terminal equilibrium point. We propose a Lyapunov loss formulation that incorporates a control-theoretic Lyapunov condition into the problem of learning a state-feedback neural control policy. We establish that L-NODEC ensures exponential stability of the controlled system, as well as its adversarial robustness to uncertain initial conditions. The performance of L-NODEC is illustrated on a benchmark double integrator problem and for optimal control of thermal dose delivery using a cold atmospheric plasma biomedical system. L-NODEC can substantially reduce the inference time necessary to reach the equilibrium state.