Abstract:Nonlinear system design is often a multi-objective optimization problem that involves searching for a design that satisfies a number of predefined constraints. The design space is typically very large because it includes all possible system architectures, each with different combinations of components. In this article, we address nonlinear system design space exploration through a two-step approach encapsulated in a framework called Fast Design Space Exploration of Nonlinear Systems (ASSENT). In the first step, we use a genetic algorithm to search over system architectures with discrete choices of component values, or over component values alone when the architecture is fixed. This step yields a coarse design, since the system may or may not meet the target specifications. In the second step, we use inverse design to search over a continuous space and fine-tune the component values with the goal of improving the value of the objective function. We use a neural network to model the system response. The neural network is converted into a mixed-integer linear program for active learning to sample component values efficiently. We illustrate the efficacy of ASSENT on problems ranging from nonlinear system design to the design of electrical circuits. Experimental results show that ASSENT matches or improves on the objective function value achieved by various other optimization techniques for nonlinear system design, by up to 54%. It also improves sample efficiency by 6-10x compared to reinforcement-learning-based synthesis of electrical circuits.
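The coarse-then-fine structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the objective `cost`, the target values, and the discrete grid are hypothetical, and the second step swaps in a plain coordinate pattern search as a stand-in for the neural-network and mixed-integer-linear-program inverse design, which is not reproduced here.

```python
import random

TARGET = (1.37, 4.62)                  # hypothetical ideal component values
GRID = [float(v) for v in range(10)]   # hypothetical discrete component choices


def cost(x):
    """Toy objective: squared distance to TARGET (lower is better)."""
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


def coarse_ga(rng, pop_size=12, generations=5):
    """Step 1: small genetic algorithm over discrete choices, stopped early."""
    pop = [[rng.choice(GRID) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]             # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            if rng.random() < 0.3:                 # occasional mutation
                child[rng.randrange(len(child))] = rng.choice(GRID)
            children.append(child)
        pop = parents + children
    return min(pop, key=cost)


def fine_tune(x, step=0.5, tol=1e-4):
    """Step 2 stand-in: coordinate pattern search over continuous values."""
    x = list(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                if cost(trial) < cost(x):          # accept only improvements
                    x, improved = trial, True
        if not improved:
            step /= 2                              # refine the search scale
    return x


rng = random.Random(0)
coarse = coarse_ga(rng)      # lands on the discrete grid
fine = fine_tune(coarse)     # continuous refinement of the coarse design
print(coarse, cost(coarse))
print(fine, cost(fine))
```

The design choice the sketch tries to convey is that the discrete search only needs to get close, because the continuous step can then recover values that lie between the grid points.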
Abstract:Design of cyber-physical systems (CPSs) is a challenging task that involves searching a large space of CPS configurations and possible values of the components that make up the system. Hence, there is a need for sample-efficient CPS design space exploration to select the system architecture and component values that meet the target system requirements. We address this challenge by formulating CPS design as a multi-objective optimization problem and propose DISPATCH, a two-step methodology for sample-efficient search over the design space. In the first step, we use a genetic algorithm to search over discrete choices of component values, either for joint architecture search and component selection or for component selection alone, and terminate the algorithm before the system requirements are met, thus yielding a coarse design. In the second step, we use inverse design to search over a continuous space, fine-tuning the component values to meet the diverse set of system requirements. We use a neural network as a surrogate function for the inverse design of the system. The neural network, converted into a mixed-integer linear program, is used for active learning to sample component values efficiently in a continuous search space. We illustrate the efficacy of DISPATCH on electrical circuit benchmarks: two-stage and three-stage transimpedance amplifiers. Simulation results show that the proposed methodology improves sample efficiency by 5-14x compared to a prior synthesis method that relies on reinforcement learning. It also synthesizes circuits with the best performance (highest bandwidth/lowest area) compared to designs produced by reinforcement learning, Bayesian optimization, or human designers.
Abstract:In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic utility are studied under an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers, whose utilities sum to zero, compete to influence a large population of agents. In particular, the case in which the transition and utility functions depend on the state, the actions of the controllers, and the mean of the state and of the actions is investigated. The game is analyzed and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy optimization methods that rely on policy gradient are proposed, one for the model-based framework and one for the sample-based framework. In the model-based case, the gradients are computed exactly using the model, whereas in the sample-based case they are estimated using Monte-Carlo simulations. Numerical experiments show the convergence of the two players' controls as well as of the utility function when the two algorithms are used in different scenarios.
Abstract:In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic cost are studied under an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers, whose utilities sum to zero, compete to influence a large population of indistinguishable agents. In particular, the case in which the transition and utility functions depend on the state, the actions of the controllers, and the mean of the state and of the actions is investigated. The optimality conditions of the game are analyzed for both open-loop and closed-loop controls, and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy optimization methods that rely on policy gradient are proposed, one for the model-based framework and one for the sample-based framework. In the model-based case, the gradients are computed exactly using the model, whereas in the sample-based case they are estimated using Monte-Carlo simulations. Numerical experiments are conducted to show the convergence of the utility function as well as of the two players' controls.
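To make the model-based policy-gradient idea concrete, the following is a minimal sketch on a deliberately simplified problem: a scalar, deterministic, discounted linear-quadratic zero-sum game with no mean-field terms. All constants, the linear feedback form u1 = k1*x and u2 = k2*x, and the convention that player 1 minimizes the cost while player 2 maximizes it are assumptions made for illustration, not the paper's setting. For a unit initial state, the discounted cost has the closed form (Q + R1*k1^2 - R2*k2^2) / (1 - GAMMA*c^2) with c = A + B1*k1 + B2*k2, so the exact gradient is available and simultaneous gradient descent-ascent can be run directly.

```python
A, B1, B2 = 0.8, 1.0, 0.5    # hypothetical dynamics: x' = A*x + B1*u1 + B2*u2
Q, R1, R2 = 1.0, 1.0, 4.0    # hypothetical stage cost: Q*x^2 + R1*u1^2 - R2*u2^2
GAMMA = 0.9                  # discount factor


def _closed_loop(k1, k2):
    """Return closed-loop gain c and denominator 1 - GAMMA*c^2."""
    c = A + B1 * k1 + B2 * k2
    d = 1.0 - GAMMA * c * c
    if d <= 0:
        raise ValueError("discounted cost is infinite for these gains")
    return c, d


def cost(k1, k2):
    """Discounted cost from x0 = 1 under u1 = k1*x, u2 = k2*x."""
    _, d = _closed_loop(k1, k2)
    return (Q + R1 * k1 ** 2 - R2 * k2 ** 2) / d


def grad(k1, k2):
    """Exact gradient of cost with respect to (k1, k2), by the quotient rule."""
    c, d = _closed_loop(k1, k2)
    n = Q + R1 * k1 ** 2 - R2 * k2 ** 2
    g1 = (2 * R1 * k1 * d + 2 * GAMMA * c * B1 * n) / d ** 2
    g2 = (-2 * R2 * k2 * d + 2 * GAMMA * c * B2 * n) / d ** 2
    return g1, g2


def descent_ascent(steps=2000, lr=0.02):
    """Player 1 descends the cost, player 2 ascends it, simultaneously."""
    k1 = k2 = 0.0
    for _ in range(steps):
        g1, g2 = grad(k1, k2)
        k1 -= lr * g1
        k2 += lr * g2
    return k1, k2


k1_star, k2_star = descent_ascent()
print(k1_star, k2_star, cost(k1_star, k2_star))
```

A sample-based variant would replace `grad` with an estimate built from simulated trajectories; that part is not sketched here.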
Abstract:In this paper, the problems of user offloading and resource optimization are jointly addressed to support ultra-reliable and low-latency communications (URLLC) in heterogeneous networks (HetNets). In particular, a multi-tier network with a single macro base station (MBS) and multiple overlaid small cell base stations (SBSs), which includes users with different latency and reliability constraints, is considered. Modeling the latency and reliability constraints of users with probabilistic guarantees, the joint problem of user offloading and resource allocation (JUR) in a URLLC setting is formulated as an optimization problem to minimize the cost of serving users for the MBS. In the considered scheme, SBSs bid to serve URLLC users under their coverage at a given price, and the MBS decides whether to serve each user locally or to offload it to one of the overlaid SBSs. Since the JUR optimization is NP-hard, we propose a low-complexity learning-based heuristic method (LHM), which includes a support vector machine-based user association model and a convex resource optimization (CRO) algorithm. To further reduce the delay, we propose an alternating direction method of multipliers (ADMM)-based solution to the CRO problem. Simulation results show that using LHM, the MBS significantly decreases the spectrum access delay for users (by $\sim$93\%) compared to JUR, while also reducing its bandwidth and power costs in serving users (by $\sim$33\%) compared to directly serving users without offloading.
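As an illustration of how ADMM splits a convex resource allocation problem, here is a minimal sketch on a toy problem that is assumed for this example and is not the paper's CRO formulation: minimize the sum over users of 0.5 * w_i * (x_i - d_i)^2 subject to the allocations x_i summing to a fixed budget. The weights, demands, budget, and function names are all hypothetical. The per-user quadratic is handled in the x-update, the budget constraint in the z-update (a projection onto the hyperplane), and u is the scaled dual variable.

```python
def admm_allocate(weights, demands, budget, rho=1.0, iters=500):
    """Scaled-form ADMM for: min sum 0.5*w*(x - d)^2  s.t.  sum(x) == budget."""
    n = len(demands)
    z = [budget / n] * n      # copy of x that satisfies the budget constraint
    u = [0.0] * n             # scaled dual variable
    for _ in range(iters):
        # x-update: separable quadratic, closed form for each user
        x = [(w * d + rho * (zi - ui)) / (w + rho)
             for w, d, zi, ui in zip(weights, demands, z, u)]
        # z-update: project x + u onto the hyperplane {z : sum(z) == budget}
        v = [xi + ui for xi, ui in zip(x, u)]
        shift = (sum(v) - budget) / n
        z = [vi - shift for vi in v]
        # dual update: accumulate the disagreement between x and z
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return z


def closed_form(weights, demands, budget):
    """Reference solution from the Lagrangian conditions w*(x - d) + lam = 0."""
    lam = (sum(demands) - budget) / sum(1.0 / w for w in weights)
    return [d - lam / w for w, d in zip(weights, demands)]


weights = [1.0, 2.0, 4.0]     # hypothetical per-user priority weights
demands = [5.0, 3.0, 2.0]     # hypothetical requested resource amounts
budget = 8.0                  # hypothetical total resource available
alloc = admm_allocate(weights, demands, budget)
exact = closed_form(weights, demands, budget)
print(alloc)
print(exact)
```

This toy problem has a closed-form answer, which is only used here to check the iteration; the appeal of ADMM in larger settings is that the x-update decomposes across users.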