Abstract: We study the problem of propagating the mean and covariance of a general multivariate Gaussian distribution through a deep (residual) neural network using layer-by-layer moment matching. We close a longstanding gap by deriving exact moment matching for the probit, GeLU, ReLU (as a limit of GeLU), Heaviside (as a limit of probit), and sine activation functions, for both feedforward and generalized residual layers. On random networks, we find orders-of-magnitude improvements in the KL divergence error metric, up to a millionfold, over popular alternatives. On real data, we find competitive statistical calibration for inference under epistemic uncertainty in the input. On a variational Bayes network, we show that our method attains hundredfold improvements in KL divergence from Monte Carlo ground truth over a state-of-the-art deterministic inference method. We also give an a priori error bound and a preliminary analysis of stochastic feedforward neurons, which have recently attracted general interest.
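
To illustrate what moment matching means for a single neuron, the following minimal sketch evaluates the standard closed-form Gaussian moments of the probit and ReLU activations for a scalar input x ~ N(mu, sigma^2). It is not the multivariate, layer-by-layer method of the paper; the function names are hypothetical and chosen only for this illustration.

```python
import math

def std_normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal (the probit activation)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_mean(mu, sigma):
    """E[Phi(x)] for x ~ N(mu, sigma^2), via the identity Phi(mu / sqrt(1 + sigma^2))."""
    return std_normal_cdf(mu / math.sqrt(1.0 + sigma * sigma))

def relu_mean_var(mu, sigma):
    """Mean and variance of max(x, 0) for x ~ N(mu, sigma^2), assuming sigma > 0."""
    z = mu / sigma
    mean = mu * std_normal_cdf(z) + sigma * std_normal_pdf(z)
    # Second moment E[max(x, 0)^2]
    second = (mu * mu + sigma * sigma) * std_normal_cdf(z) + mu * sigma * std_normal_pdf(z)
    return mean, second - mean * mean
```

In a scalar setting, matching these two moments at each layer replaces the non-Gaussian output of the activation with a Gaussian of the same mean and variance before the next layer is applied.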




Abstract: The Kalman filter and Rauch-Tung-Striebel (RTS) smoother are optimal for state estimation in linear dynamic systems. With nonlinear systems, the challenge is how to propagate uncertainty through the state transitions and output function. For the case of a neural network model, we enable accurate uncertainty propagation using a recent state-of-the-art analytic formula for computing the mean and covariance of a deep neural network with Gaussian input. We argue that cross entropy is a more appropriate performance metric than RMSE for evaluating the accuracy of filters and smoothers. We demonstrate the superiority of our method for state estimation on a stochastic Lorenz system and a Wiener system, and find that our method yields better linear quadratic regulation when the state estimate is used for feedback.
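
For orientation, here is a minimal sketch of one predict/update step of the classical Kalman filter in the scalar linear case, the baseline that the abstract describes as optimal. It assumes the model x_next = a*x + w and y = c*x + v with noise variances q and r; the function and parameter names are hypothetical. In the nonlinear setting of the paper, the predict step would instead rely on propagating the mean and covariance through the neural network, which this sketch does not attempt.

```python
def kalman_step(mean, var, y, a, c, q, r):
    """One predict/update step for x_next = a*x + w, y = c*x + v (all scalars).

    mean, var: current Gaussian state estimate
    y:         new measurement
    q, r:      process and measurement noise variances
    """
    # Predict: push the mean and variance through the linear dynamics.
    mean_pred = a * mean
    var_pred = a * a * var + q

    # Update: correct the prediction with the measurement.
    innovation_var = c * c * var_pred + r
    gain = var_pred * c / innovation_var
    mean_new = mean_pred + gain * (y - c * mean_pred)
    var_new = (1.0 - gain * c) * var_pred
    return mean_new, var_new
```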
Abstract: When the physics is wrong, physics-informed machine learning becomes physics-misinformed machine learning. A powerful black-box model should not be able to conceal misconceived physics. We propose two criteria that can be used to assert the integrity of a hybrid (physics plus black-box) model: 0) the black-box model should be unable to replicate the physical model, and 1) any best-fit hybrid model has the same physical parameter as a best-fit standalone physics model. We demonstrate them for a sample nonlinear mechanical system approximated by its small-signal linearization.
Abstract: Differentiating noisy, discrete measurements in order to fit an ordinary differential equation can be unreasonably effective. Assuming square-integrable noise and minimal flow regularity, we construct and analyze a finite-difference differentiation filter and a Tikhonov-regularized least squares estimator for the continuous-time parameter-linear system. Combining these contributions in series, we obtain a finite-sample bound on the mean absolute error of estimation. As a by-product, we offer a novel analysis of stochastically perturbed Moore-Penrose pseudoinverses.
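
The two-stage idea, differentiate and then regress, can be sketched on a toy problem. The example below assumes a scalar parameter-linear system dx/dt = theta * x with theta = -2, noise-free uniform samples, a plain central difference in place of the paper's differentiation filter, and a single-parameter ridge estimate in place of its Tikhonov-regularized estimator. All names and numerical values are hypothetical choices for illustration.

```python
import math

def central_difference(x, h):
    """Derivative estimates at the interior samples of a uniformly sampled signal."""
    return [(x[i + 1] - x[i - 1]) / (2.0 * h) for i in range(1, len(x) - 1)]

def ridge_scalar(features, targets, lam):
    """Tikhonov-regularized least squares for one parameter:
    minimizes sum((t - theta * f)^2) + lam * theta^2."""
    num = sum(f * t for f, t in zip(features, targets))
    den = sum(f * f for f in features) + lam
    return num / den

# Samples of x(t) = exp(-2 t), which solves dx/dt = theta * x with theta = -2.
h = 0.01
x = [math.exp(-2.0 * k * h) for k in range(200)]

# Stage 1: estimate the derivative from the samples.
dx = central_difference(x, h)

# Stage 2: regress the derivative estimates on the state (interior samples only).
theta_hat = ridge_scalar(x[1:-1], dx, lam=1e-6)
```

With noisy measurements, the step size h and the regularization weight lam would need to be chosen jointly, which is the kind of trade-off a finite-sample error bound makes explicit.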