Abstract:Stochastic nonlinear dynamical systems are ubiquitous in modern, real-world applications. Yet, estimating the unknown parameters of stochastic, nonlinear dynamical models remains a challenging problem. The majority of existing methods employ maximum likelihood or Bayesian estimation. However, these methods suffer from some limitations, most notably substantial computational time for inference and limited flexibility in application. In this work, we propose DeepBayes estimators that leverage the power of deep recurrent neural networks to learn an estimator. The method consists of first training a recurrent neural network to minimize the mean-squared estimation error over a set of synthetically generated data using models drawn from the model set of interest. The a priori trained estimator can then be used directly for inference by evaluating the network with the estimation data. The deep recurrent neural network architectures can be trained offline and ensure significant time savings during inference. We experiment with two popular recurrent neural network architectures -- the long short-term memory (LSTM) network and the gated recurrent unit (GRU). We demonstrate the applicability of our proposed method on different example models and perform detailed comparisons with state-of-the-art approaches. We also provide a study on a real-world nonlinear benchmark problem. The experimental evaluations show that the proposed approach is asymptotically as good as the Bayes estimator.
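A minimal sketch of the training-then-inference recipe, assuming a GRU that maps a simulated output sequence to a point estimate of the model parameters; the toy simulator simulate_batch, the two-dimensional parameter vector, and all training hyperparameters are illustrative assumptions rather than the paper's setup.
```python
# Sketch: train a GRU to regress model parameters from simulated output
# sequences (minimizing the mean-squared estimation error), then reuse it
# for fast inference. The toy simulator and all sizes are assumptions.
import torch
import torch.nn as nn

def simulate_batch(batch_size=64, seq_len=100, param_dim=2):
    """Toy synthetic-data generator: draw parameters from a prior and
    simulate a noisy scalar output sequence for each draw."""
    theta = torch.rand(batch_size, param_dim)                # prior draw
    t = torch.linspace(0, 1, seq_len)
    y = theta[:, :1] * torch.sin(8 * t) + theta[:, 1:] * t   # toy dynamics
    y = y + 0.05 * torch.randn(batch_size, seq_len)          # additive noise
    return y.unsqueeze(-1), theta                            # (B, T, 1), (B, p)

class GRUEstimator(nn.Module):
    def __init__(self, param_dim=2, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, param_dim)

    def forward(self, y):
        _, h = self.rnn(y)        # final hidden state summarizes the sequence
        return self.head(h[-1])   # point estimate of the parameters

model = GRUEstimator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):          # offline training phase
    y, theta = simulate_batch()
    loss = nn.functional.mse_loss(model(y), theta)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Inference is a single forward pass on the estimation data.
y_test, theta_true = simulate_batch(batch_size=1)
theta_hat = model(y_test).detach()
```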
Abstract:We address the issue of estimating the topology and dynamics of sparse linear dynamic networks in a hyperparameter-free setting. We propose a method to estimate the network dynamics in a computationally efficient iterative framework, free of parameter tuning, known as SPICE (Sparse Iterative Covariance Estimation). The estimated dynamics directly reveal the underlying topology. Our approach does not assume that the network is undirected and is applicable even with varying noise levels across the modules of the network. We also do not assume any explicit prior knowledge of the network dynamics. Numerical experiments with realistic dynamic networks illustrate the usefulness of our method.
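As an illustration of how sparse module estimates reveal topology, the sketch below regresses one node's signal on lagged signals of candidate parent nodes (assumed FIR modules) and reads off the nonzero blocks; a plain Lasso, which unlike SPICE requires a regularization parameter, is used as a stand-in for the tuning-free iterations, and the toy network is an assumption.
```python
# Sketch: topology read-out from sparsely estimated FIR modules. A Lasso is
# used here only as a stand-in for the tuning-free SPICE iterations.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
T, order = 400, 5
w0, w1 = rng.standard_normal(T), rng.standard_normal(T)   # signals at nodes 0, 1

def lagged(x, order):
    """Columns hold x delayed by 1, ..., order samples."""
    return np.column_stack([np.concatenate([np.zeros(k + 1), x[:-(k + 1)]])
                            for k in range(order)])

# True topology: only node 0 feeds node 2, through a decaying FIR module.
g02 = 0.8 ** np.arange(1, order + 1)
y2 = lagged(w0, order) @ g02 + 0.05 * rng.standard_normal(T)

# Regress node 2 on lagged signals of all candidate parents; the sparse fit
# zeroes the FIR blocks of absent links, revealing the topology.
Phi = np.hstack([lagged(w0, order), lagged(w1, order)])
coef = Lasso(alpha=0.01, fit_intercept=False).fit(Phi, y2).coef_
blocks = coef.reshape(2, order)                            # one block per parent
print("incoming links to node 2 (from node 0, node 1):",
      np.abs(blocks).sum(axis=1) > 1e-3)
```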
Abstract:This paper concerns the problem of learning control policies for an unknown linear dynamical system to minimize a quadratic cost function. We present a method, based on convex optimization, that accomplishes this task robustly; that is, we minimize the worst-case cost, accounting for system uncertainty given the observed data. The method balances exploitation and exploration, exciting the system so as to reduce uncertainty in the model parameters to which the worst-case cost is most sensitive. Numerical simulations and application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both.
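The sketch below only illustrates the ingredients the method combines: fitting (A, B) to observed data by least squares, quantifying parameter uncertainty, and evaluating a worst-case quadratic cost over that uncertainty. The worst case is approximated crudely here by sampling models rather than by the convex program the paper proposes, and the true system, candidate gain, and horizon are all assumptions.
```python
# Sketch: data-driven uncertainty + worst-case quadratic cost evaluation.
# (A, B) is fit by least squares, models are sampled from the resulting
# parameter distribution, and the worst sampled closed-loop cost of a fixed
# candidate gain K is reported. Everything below is an illustrative stand-in.
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.2], [0.0, 0.7]])
B_true = np.array([[0.0], [1.0]])
Q, R, sigma = np.eye(2), np.eye(1), 0.1

# Collect data under exploratory (random) excitation.
T, x = 200, np.zeros(2)
X, U, Xn = [], [], []
for _ in range(T):
    u = rng.standard_normal(1)
    x_next = A_true @ x + B_true @ u + sigma * rng.standard_normal(2)
    X.append(x); U.append(u); Xn.append(x_next)
    x = x_next
Z = np.hstack([np.array(X), np.array(U)])                 # regressors [x_t, u_t]
Theta = np.linalg.lstsq(Z, np.array(Xn), rcond=None)[0]   # least-squares [A B]^T
cov = sigma**2 * np.linalg.inv(Z.T @ Z)                   # parameter covariance

def closed_loop_cost(A, B, K, horizon=200):
    """Finite-horizon quadratic cost of u = -K x from a fixed initial state."""
    x, cost = np.ones(2), 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost

K = np.array([[0.0, 0.6]])                                # candidate gain (assumed)
costs = []
for _ in range(500):                                      # sample plausible models
    Th = Theta + rng.multivariate_normal(np.zeros(3), cov, size=2).T
    costs.append(closed_loop_cost(Th[:2].T, Th[2:].T, K))
print("nominal cost:", closed_loop_cost(Theta[:2].T, Theta[2:].T, K))
print("worst-case (sampled) cost:", max(costs))
```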
Abstract:In this paper we introduce a novel method for linear system identification with quantized output data. We model the impulse response as a zero-mean Gaussian process whose covariance (kernel) is given by the recently proposed stable spline kernel, which encodes information on regularity and exponential stability. This serves as a starting point to cast our system identification problem into a Bayesian framework. We employ Markov Chain Monte Carlo methods to provide an estimate of the system. In particular, we design two methods based on the so-called Gibbs sampler that also allow estimation of the kernel hyperparameters by marginal likelihood maximization via the expectation-maximization method. Numerical simulations show the effectiveness of the proposed scheme, as compared to the state-of-the-art kernel-based methods when these are employed in system identification with quantized data.
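For concreteness, a small sketch of the prior and data model the abstract builds on: a first-order stable spline (TC) kernel as the Gaussian prior on the impulse response, and a uniform quantizer acting on the noisy output. The FIR length, hyperparameter values, and quantizer are illustrative assumptions, and the MCMC machinery itself is not reproduced here.
```python
# Sketch: first-order stable spline (TC) kernel K[i, j] = c * beta**max(i, j)
# as a Gaussian prior on the impulse response, plus a toy uniform quantizer
# on the noisy output. Sizes and hyperparameters are assumed.
import numpy as np

def stable_spline_kernel(n, c=1.0, beta=0.8):
    idx = np.arange(1, n + 1)
    return c * beta ** np.maximum.outer(idx, idx)

rng = np.random.default_rng(0)
n, T, sigma, delta = 30, 200, 0.1, 0.5          # FIR length, samples, noise, bin
K = stable_spline_kernel(n)
g = rng.multivariate_normal(np.zeros(n), K)     # impulse response ~ N(0, K)

u = rng.standard_normal(T)
Phi = np.column_stack([np.concatenate([np.zeros(k), u[:T - k]])
                       for k in range(n)])      # column k: input delayed by k
z = Phi @ g + sigma * rng.standard_normal(T)    # latent (unquantized) output
y = delta * np.floor(z / delta) + delta / 2     # quantized measurements

# Under this prior the latent output is z ~ N(0, Phi K Phi' + sigma^2 I); the
# Bayesian/MCMC machinery of the paper then works with p(g, z | y).
cov_z = Phi @ K @ Phi.T + sigma**2 * np.eye(T)
```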
Abstract:We propose a new method for blind system identification. Resorting to a Gaussian regression framework, we model the impulse response of the unknown linear system as a realization of a Gaussian process. The structure of the covariance matrix (or kernel) of such a process is given by the stable spline kernel, which has been recently introduced for system identification purposes and depends on an unknown hyperparameter. We assume that the input can be linearly described by a few parameters. We estimate these parameters, together with the kernel hyperparameter and the noise variance, using an empirical Bayes approach. The related optimization problem is efficiently solved with a novel iterative scheme based on the Expectation-Maximization method. In particular, we show that each iteration consists of a set of simple update rules. Numerical experiments show the very promising performance of the proposed method.
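A compact sketch of the empirical Bayes step in the linear-Gaussian case: the kernel hyperparameters and noise variance are chosen by maximizing the marginal likelihood of the output, after which the impulse response is estimated by its posterior mean. A generic numerical optimizer stands in for the paper's EM scheme, and the input is taken as known for simplicity (in the blind setting it would be parametrized and estimated as well).
```python
# Sketch: empirical Bayes hyperparameter tuning for y = Phi g + e with
# g ~ N(0, K(c, beta)) (stable spline / TC kernel) and e ~ N(0, s2 I).
# A generic optimizer replaces the paper's EM iterations (assumption).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, T = 20, 150
u = rng.standard_normal(T)
Phi = np.column_stack([np.concatenate([np.zeros(k), u[:T - k]]) for k in range(n)])
g_true = 0.7 ** np.arange(1, n + 1)
y = Phi @ g_true + 0.1 * rng.standard_normal(T)
idx = np.arange(1, n + 1)

def neg_log_marglik(theta):
    log_c, logit_beta, log_s2 = theta
    c, beta, s2 = np.exp(log_c), 1 / (1 + np.exp(-logit_beta)), np.exp(log_s2)
    K = c * beta ** np.maximum.outer(idx, idx)      # stable spline (TC) kernel
    S = Phi @ K @ Phi.T + s2 * np.eye(T)            # marginal covariance of y
    _, logdet = np.linalg.slogdet(S)
    return 0.5 * (logdet + y @ np.linalg.solve(S, y))

res = minimize(neg_log_marglik, x0=np.zeros(3), method="Nelder-Mead")
c = np.exp(res.x[0])
beta = 1 / (1 + np.exp(-res.x[1]))
s2 = np.exp(res.x[2])

# Posterior mean of the impulse response with the tuned hyperparameters.
K = c * beta ** np.maximum.outer(idx, idx)
g_hat = K @ Phi.T @ np.linalg.solve(Phi @ K @ Phi.T + s2 * np.eye(T), y)
```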
Abstract:Recent developments in system identification have brought attention to regularized kernel-based methods, where, adopting the recently introduced stable spline kernel, prior information on the unknown process is enforced. This reduces the variance of the estimates and thus makes kernel-based methods particularly attractive when few input-output data samples are available. In such cases however, the influence of the system initial conditions may have a significant impact on the output dynamics. In this paper, we specifically address this point. We propose three methods that deal with the estimation of initial conditions using different types of information. The methods consist of mixed maximum likelihood--maximum a posteriori estimators that estimate the initial conditions and tune the hyperparameters characterizing the stable spline kernel. To solve the related optimization problems, we resort to the expectation-maximization method, showing that the solutions can be attained by iterating over simple update steps. Numerical experiments show the advantages, in terms of accuracy in reconstructing the system impulse response, of the proposed strategies, compared to other kernel-based schemes that do not account for the effect of initial conditions.
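A small sketch of where initial conditions enter: for an FIR model the first output samples depend on inputs applied before the experiment started, which can be treated as extra unknowns. The alternating least-squares scheme below is a simplified stand-in for the mixed estimators of the abstract, and all sizes and signals are assumptions.
```python
# Sketch: the first outputs of an FIR model depend on unknown pre-experiment
# inputs u(0), u(-1), ...  Here they are estimated jointly with the impulse
# response by alternating ordinary least-squares steps (a simplified stand-in
# for the mixed maximum likelihood--a posteriori estimators of the abstract).
import numpy as np

rng = np.random.default_rng(2)
n, T = 10, 80
g_true = 0.7 ** np.arange(1, n + 1)          # true impulse response
u_pre_true = rng.standard_normal(n)          # unknown inputs u(1-n), ..., u(0)
u = rng.standard_normal(T)                   # measured inputs u(1), ..., u(T)

def regressor(u_pre, u):
    """Row t holds [u(t-1), ..., u(t-n)], built from pre-experiment + measured inputs."""
    u_full = np.concatenate([u_pre, u])
    return np.column_stack([u_full[n - k: n - k + T] for k in range(1, n + 1)])

y = regressor(u_pre_true, u) @ g_true + 0.05 * rng.standard_normal(T)

# Alternate: estimate g given the current initial conditions, then the
# initial conditions given g; each step is linear least squares.
u_pre, g = np.zeros(n), np.zeros(n)
for _ in range(20):
    g = np.linalg.lstsq(regressor(u_pre, u), y, rcond=None)[0]
    M = np.zeros((T, n))                      # how u_pre enters the early outputs
    for t in range(n):                        # output row t corresponds to time t+1
        for k in range(t + 1, n + 1):         # lags reaching before time 1
            M[t, t + n - k] = g[k - 1]        # coefficient multiplying u(t+1-k)
    resid = y - regressor(np.zeros(n), u) @ g
    u_pre = np.linalg.lstsq(M, resid, rcond=None)[0]
```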
Abstract:In this paper we propose a new identification scheme for Hammerstein systems, which are dynamic systems consisting of a static nonlinearity and a linear time-invariant dynamic system in cascade. We assume that the nonlinear function can be described as a linear combination of $p$ basis functions. We reconstruct the $p$ coefficients of the nonlinearity together with the first $n$ samples of the impulse response of the linear system by estimating an $np$-dimensional overparameterized vector, which contains all the products of the unknown variables. To avoid high variance in these estimates, we adopt a regularized kernel-based approach and, in particular, we introduce a new kernel tailored for Hammerstein system identification. We show that the resulting scheme provides an estimate of the overparameterized vector that can be uniquely decomposed as the combination of an impulse response and $p$ coefficients of the static nonlinearity. We also show, through several numerical experiments, that the proposed method compares very favorably with two standard methods for Hammerstein system identification.
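A sketch of the overparameterization idea: with f(u) = sum_j c_j phi_j(u) and an FIR linear block g, the output is linear in the n*p products g_i c_j, so that vector can be estimated by regularized least squares and then split by a rank-one factorization of its n-by-p reshaping. The polynomial basis, the sizes, and the plain ridge penalty below are assumptions; the paper instead uses a kernel tailored to this structure.
```python
# Sketch: Hammerstein identification via overparameterization. The output
# y(t) = sum_i g_i f(u(t-i)) with f(u) = sum_j c_j phi_j(u) is linear in the
# products Theta[i, j] = g_i * c_j; estimate that vector by ridge-regularized
# least squares and split it by a rank-one SVD. Basis and sizes are assumed.
import numpy as np

rng = np.random.default_rng(3)
n, p, T = 15, 3, 400
basis = [lambda u: u, lambda u: u**2, lambda u: u**3]     # assumed basis functions
c_true = np.array([1.0, 0.5, -0.2])
g_true = 0.6 ** np.arange(1, n + 1)
u = rng.uniform(-1, 1, T)

def delayed(x, k):
    """x delayed by k samples (zero initial conditions)."""
    return np.concatenate([np.zeros(k), x[:T - k]])

# Regressor for the np-dimensional overparameterized vector vec(Theta):
# column (i, j) holds phi_j(u(t - i)).
Z = np.column_stack([delayed(phi(u), i) for i in range(1, n + 1) for phi in basis])
y = Z @ np.outer(g_true, c_true).ravel() + 0.02 * rng.standard_normal(T)

# Ridge-regularized estimate of the products (stand-in for the tailored kernel).
lam = 1e-2
theta = np.linalg.solve(Z.T @ Z + lam * np.eye(n * p), Z.T @ y)

# Rank-one decomposition Theta ~ g c', unique up to a scaling fixed below.
U, s, Vt = np.linalg.svd(theta.reshape(n, p))
g_hat, c_hat = U[:, 0] * s[0], Vt[0]
g_hat, c_hat = g_hat * c_hat[0], c_hat / c_hat[0]         # normalize c_hat[0] = 1
```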
Abstract:Recent developments in system identification have brought attention to regularized kernel-based methods. This type of approach has been proven to compare favorably with classic parametric methods. However, current formulations are not robust with respect to outliers. In this paper, we introduce a novel method to robustify kernel-based system identification methods. To this end, we model the output measurement noise using random variables with heavy-tailed probability density functions (pdfs), focusing on the Laplacian and the Student's t distributions. Exploiting the representation of these pdfs as scale mixtures of Gaussians, we cast our system identification problem into a Gaussian process regression framework, which requires estimating a number of hyperparameters on the order of the data size. To overcome this difficulty, we design a new maximum a posteriori (MAP) estimator of the hyperparameters, and solve the related optimization problem with a novel iterative scheme based on the Expectation-Maximization (EM) method. In the presence of outliers, tests on simulated data and on a real system show a substantial performance improvement compared to currently used kernel-based methods for linear system identification.
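A quick numerical illustration of the representation the method exploits: a Laplacian random variable can be generated as a zero-mean Gaussian whose variance is itself exponentially distributed (the Student's t arises analogously with an inverse-gamma mixing density). The parameter values below are arbitrary.
```python
# Sketch: Laplace noise as a scale mixture of Gaussians. If tau ~ Exp with
# rate lam**2 / 2 and e | tau ~ N(0, tau), then e ~ Laplace(0, 1/lam). This
# is the representation that keeps the model conditionally Gaussian.
import numpy as np

rng = np.random.default_rng(4)
lam, N = 2.0, 200_000

tau = rng.exponential(scale=2 / lam**2, size=N)    # mixing variances
e_mixture = rng.normal(0.0, np.sqrt(tau))          # conditionally Gaussian noise
e_laplace = rng.laplace(0.0, 1 / lam, size=N)      # direct Laplace samples

# The two samples should match in distribution (compare a few moments).
print(np.var(e_mixture), np.var(e_laplace), 2 / lam**2)
print(np.mean(np.abs(e_mixture)), np.mean(np.abs(e_laplace)), 1 / lam)
```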
Abstract:In this paper we introduce a novel method for linear system identification with quantized output data. We model the impulse response as a zero-mean Gaussian process whose covariance (kernel) is given by the recently proposed stable spline kernel, which encodes information on regularity and exponential stability. This serves as a starting point to cast our system identification problem into a Bayesian framework. We employ Markov Chain Monte Carlo (MCMC) methods to provide an estimate of the system. In particular, we show how to design a Gibbs sampler which quickly converges to the target distribution. Numerical simulations show a substantial improvement in the accuracy of the estimates over state-of-the-art kernel-based methods when employed in identification of systems with quantized data.
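A minimal sketch of the two Gibbs steps in this setting: conditionally on the latent unquantized outputs the impulse response has a Gaussian posterior, and conditionally on the impulse response each latent output is a Gaussian truncated to the quantization bin of its measurement. The kernel (a first-order stable spline/TC kernel), noise level, and uniform quantizer are fixed here for brevity rather than estimated.
```python
# Sketch: Gibbs sampler for FIR identification from uniformly quantized data.
# Model: z = Phi g + e, g ~ N(0, K), e ~ N(0, s2 I), y = quantize(z).
# Step 1: g | z is Gaussian.  Step 2: z_t | g, y_t is a Gaussian truncated to
# the bin of y_t.  Hyperparameters and the quantizer are fixed (assumptions).
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(5)
n, T, s2, delta = 20, 300, 0.01, 0.5
idx = np.arange(1, n + 1)
K = 0.8 ** np.maximum.outer(idx, idx)                 # stable spline (TC) kernel
u = rng.standard_normal(T)
Phi = np.column_stack([np.concatenate([np.zeros(k), u[:T - k]])
                       for k in range(1, n + 1)])
g_true = 0.7 ** idx
y = delta * np.floor((Phi @ g_true + np.sqrt(s2) * rng.standard_normal(T)) / delta)

Kinv = np.linalg.inv(K)
lo, hi = y, y + delta                                 # quantization bin of each y_t
z = y + delta / 2                                     # initialize latent outputs
samples = []
for it in range(500):
    # g | z: Gaussian with the usual kernel-regression posterior.
    P = np.linalg.inv(Phi.T @ Phi / s2 + Kinv)
    g = rng.multivariate_normal(P @ Phi.T @ z / s2, P)
    # z_t | g, y_t: N(phi_t' g, s2) truncated to [lo_t, hi_t].
    m = Phi @ g
    a, b = (lo - m) / np.sqrt(s2), (hi - m) / np.sqrt(s2)
    z = truncnorm.rvs(a, b, loc=m, scale=np.sqrt(s2), random_state=rng)
    if it >= 100:                                     # discard burn-in
        samples.append(g)
g_hat = np.mean(samples, axis=0)                      # posterior-mean estimate
```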
Abstract:In this article, we analyze the SPICE method developed in [1], and establish its connections with other standard sparse estimation methods such as the Lasso and the LAD-Lasso. This result positions SPICE as a computationally efficient technique for the calculation of Lasso-type estimators. Conversely, this connection is very useful for establishing the asymptotic properties of SPICE under several problem scenarios and for suggesting suitable modifications in cases where the naive version of SPICE would not work.
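For reference, the Lasso-type estimators that SPICE connects to solve an l1-regularized least-squares problem; the snippet below fits one with scikit-learn on a toy sparse regression. The particular weighting (and square-root variant) that makes the correspondence to SPICE exact is not reproduced here.
```python
# Sketch: a Lasso-type sparse estimator of the kind SPICE is connected to,
# min_x ||y - A x||_2^2 / (2 N) + alpha * ||x||_1, fitted on toy data.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
N, M, k = 100, 200, 5                       # samples, dictionary size, sparsity
A = rng.standard_normal((N, M))
x_true = np.zeros(M)
x_true[rng.choice(M, k, replace=False)] = 3 * rng.standard_normal(k)
y = A @ x_true + 0.1 * rng.standard_normal(N)

lasso = Lasso(alpha=0.05, fit_intercept=False).fit(A, y)
print("recovered support:", np.flatnonzero(lasso.coef_))
print("true support:     ", np.flatnonzero(x_true))
```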