Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Mengwu Guo

Physics-based deep kernel learning for parameter estimation in high dimensional PDEs

Sep 17, 2025

Weihao Yan, Christoph Brune, Mengwu Guo

Abstract:Inferring parameters of high-dimensional partial differential equations (PDEs) poses significant computational and inferential challenges, primarily due to the curse of dimensionality and the inherent limitations of traditional numerical methods. This paper introduces a novel two-stage Bayesian framework that synergistically integrates training, physics-based deep kernel learning (DKL) with Hamiltonian Monte Carlo (HMC) to robustly infer unknown PDE parameters and quantify their uncertainties from sparse, exact observations. The first stage leverages physics-based DKL to train a surrogate model, which jointly yields an optimized neural network feature extractor and robust initial estimates for the PDE parameters. In the second stage, with the neural network weights fixed, HMC is employed within a full Bayesian framework to efficiently sample the joint posterior distribution of the kernel hyperparameters and the PDE parameters. Numerical experiments on canonical and high-dimensional inverse PDE problems demonstrate that our framework accurately estimates parameters, provides reliable uncertainty estimates, and effectively addresses challenges of data sparsity and model complexity, offering a robust and scalable tool for diverse scientific and engineering applications.

Via

Access Paper or Ask Questions

PDE-DKL: PDE-constrained deep kernel learning in high dimensionality

Jan 30, 2025

Weihao Yan, Christoph Brune, Mengwu Guo

Figure 1 for PDE-DKL: PDE-constrained deep kernel learning in high dimensionality

Figure 2 for PDE-DKL: PDE-constrained deep kernel learning in high dimensionality

Figure 3 for PDE-DKL: PDE-constrained deep kernel learning in high dimensionality

Figure 4 for PDE-DKL: PDE-constrained deep kernel learning in high dimensionality

Abstract:Many physics-informed machine learning methods for PDE-based problems rely on Gaussian processes (GPs) or neural networks (NNs). However, both face limitations when data are scarce and the dimensionality is high. Although GPs are known for their robust uncertainty quantification in low-dimensional settings, their computational complexity becomes prohibitive as the dimensionality increases. In contrast, while conventional NNs can accommodate high-dimensional input, they often require extensive training data and do not offer uncertainty quantification. To address these challenges, we propose a PDE-constrained Deep Kernel Learning (PDE-DKL) framework that combines DL and GPs under explicit PDE constraints. Specifically, NNs learn a low-dimensional latent representation of the high-dimensional PDE problem, reducing the complexity of the problem. GPs then perform kernel regression subject to the governing PDEs, ensuring accurate solutions and principled uncertainty quantification, even when available data are limited. This synergy unifies the strengths of both NNs and GPs, yielding high accuracy, robust uncertainty estimates, and computational efficiency for high-dimensional PDEs. Numerical experiments demonstrate that PDE-DKL achieves high accuracy with reduced data requirements. They highlight its potential as a practical, reliable, and scalable solver for complex PDE-based applications in science and engineering.

* 22 pages, 9 figures

Via

Access Paper or Ask Questions

HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs

Jan 08, 2025

Nicolò Botteghi, Stefania Fresca, Mengwu Guo, Andrea Manzoni

Figure 1 for HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs

Figure 2 for HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs

Figure 3 for HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs

Figure 4 for HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs

Abstract:In this work, we devise a new, general-purpose reinforcement learning strategy for the optimal control of parametric partial differential equations (PDEs). Such problems frequently arise in applied sciences and engineering and entail a significant complexity when control and/or state variables are distributed in high-dimensional space or depend on varying parameters. Traditional numerical methods, relying on either iterative minimization algorithms or dynamic programming, while reliable, often become computationally infeasible. Indeed, in either way, the optimal control problem must be solved for each instance of the parameters, and this is out of reach when dealing with high-dimensional time-dependent and parametric PDEs. In this paper, we propose HypeRL, a deep reinforcement learning (DRL) framework to overcome the limitations shown by traditional methods. HypeRL aims at approximating the optimal control policy directly. Specifically, we employ an actor-critic DRL approach to learn an optimal feedback control strategy that can generalize across the range of variation of the parameters. To effectively learn such optimal control laws, encoding the parameter information into the DRL policy and value function neural networks (NNs) is essential. To do so, HypeRL uses two additional NNs, often called hypernetworks, to learn the weights and biases of the value function and the policy NNs. We validate the proposed approach on two PDE-constrained optimal control benchmarks, namely a 1D Kuramoto-Sivashinsky equation and a 2D Navier-Stokes equations, by showing that the knowledge of the PDE parameters and how this information is encoded, i.e., via a hypernetwork, is an essential ingredient for learning parameter-dependent control policies that can generalize effectively to unseen scenarios and for improving the sample efficiency of such policies.

Via

Access Paper or Ask Questions

Sparsifying dimensionality reduction of PDE solution data with Bregman learning

Jun 18, 2024

Tjeerd Jan Heeringa, Christoph Brune, Mengwu Guo

Abstract:Classical model reduction techniques project the governing equations onto a linear subspace of the original state space. More recent data-driven techniques use neural networks to enable nonlinear projections. Whilst those often enable stronger compression, they may have redundant parameters and lead to suboptimal latent dimensionality. To overcome these, we propose a multistep algorithm that induces sparsity in the encoder-decoder networks for effective reduction in the number of parameters and additional compression of the latent space. This algorithm starts with sparsely initialized a network and training it using linearized Bregman iterations. These iterations have been very successful in computer vision and compressed sensing tasks, but have not yet been used for reduced-order modelling. After the training, we further compress the latent space dimensionality by using a form of proper orthogonal decomposition. Last, we use a bias propagation technique to change the induced sparsity into an effective reduction of parameters. We apply this algorithm to three representative PDE models: 1D diffusion, 1D advection, and 2D reaction-diffusion. Compared to conventional training methods like Adam, the proposed method achieves similar accuracy with 30% less parameters and a significantly smaller latent space.

* 27 pages, 20 figures, 4 tables

Via

Access Paper or Ask Questions

Recurrent Deep Kernel Learning of Dynamical Systems

May 30, 2024

Nicolò Botteghi, Paolo Motta, Andrea Manzoni, Paolo Zunino, Mengwu Guo

Figure 1 for Recurrent Deep Kernel Learning of Dynamical Systems

Figure 2 for Recurrent Deep Kernel Learning of Dynamical Systems

Figure 3 for Recurrent Deep Kernel Learning of Dynamical Systems

Figure 4 for Recurrent Deep Kernel Learning of Dynamical Systems

Abstract:Digital twins require computationally-efficient reduced-order models (ROMs) that can accurately describe complex dynamics of physical assets. However, constructing ROMs from noisy high-dimensional data is challenging. In this work, we propose a data-driven, non-intrusive method that utilizes stochastic variational deep kernel learning (SVDKL) to discover low-dimensional latent spaces from data and a recurrent version of SVDKL for representing and predicting the evolution of latent dynamics. The proposed method is demonstrated with two challenging examples -- a double pendulum and a reaction-diffusion system. Results show that our framework is capable of (i) denoising and reconstructing measurements, (ii) learning compact representations of system states, (iii) predicting system evolution in low-dimensional latent spaces, and (iv) quantifying modeling uncertainties.

Via

Access Paper or Ask Questions

Gaussian process learning of nonlinear dynamics

Dec 19, 2023

Dongwei Ye, Mengwu Guo

Abstract:One of the pivotal tasks in scientific machine learning is to represent underlying dynamical systems from time series data. Many methods for such dynamics learning explicitly require the derivatives of state data, which are not directly available and can be approximated conventionally by finite differences. However, the discrete approximations of time derivatives may result in a poor estimation when state data are scarce and/or corrupted by noise, thus compromising the predictiveness of the learned dynamical models. To overcome this technical hurdle, we propose a new method that learns nonlinear dynamics through a Bayesian inference of characterizing model parameters. This method leverages a Gaussian process representation of states, and constructs a likelihood function using the correlation between state data and their derivatives, yet prevents explicit evaluations of time derivatives. Through a Bayesian scheme, a probabilistic estimate of the model parameters is given by the posterior distribution, and thus a quantification is facilitated for uncertainties from noisy state data and the learning process. Specifically, we will discuss the applicability of the proposed method to two typical scenarios for dynamical systems: parameter identification and estimation with an affine structure of the system, and nonlinear parametric approximation without prior knowledge.

Via

Access Paper or Ask Questions

Multi-fidelity reduced-order surrogate modeling

Sep 01, 2023

Paolo Conti, Mengwu Guo, Andrea Manzoni, Attilio Frangi, Steven L. Brunton, J. Nathan Kutz

Figure 1 for Multi-fidelity reduced-order surrogate modeling

Figure 2 for Multi-fidelity reduced-order surrogate modeling

Figure 3 for Multi-fidelity reduced-order surrogate modeling

Figure 4 for Multi-fidelity reduced-order surrogate modeling

Abstract:High-fidelity numerical simulations of partial differential equations (PDEs) given a restricted computational budget can significantly limit the number of parameter configurations considered and/or time window evaluated for modeling a given system. Multi-fidelity surrogate modeling aims to leverage less accurate, lower-fidelity models that are computationally inexpensive in order to enhance predictive accuracy when high-fidelity data are limited or scarce. However, low-fidelity models, while often displaying important qualitative spatio-temporal features, fail to accurately capture the onset of instability and critical transients observed in the high-fidelity models, making them impractical as surrogate models. To address this shortcoming, we present a new data-driven strategy that combines dimensionality reduction with multi-fidelity neural network surrogates. The key idea is to generate a spatial basis by applying the classical proper orthogonal decomposition (POD) to high-fidelity solution snapshots, and approximate the dynamics of the reduced states - time-parameter-dependent expansion coefficients of the POD basis - using a multi-fidelity long-short term memory (LSTM) network. By mapping low-fidelity reduced states to their high-fidelity counterpart, the proposed reduced-order surrogate model enables the efficient recovery of full solution fields over time and parameter variations in a non-intrusive manner. The generality and robustness of this method is demonstrated by a collection of parametrized, time-dependent PDE problems where the low-fidelity model can be defined by coarser meshes and/or time stepping, as well as by misspecified physical features. Importantly, the onset of instabilities and transients are well captured by this surrogate modeling technique.

Via

Access Paper or Ask Questions

Bayesian approach to Gaussian process regression with uncertain inputs

May 19, 2023

Dongwei Ye, Mengwu Guo

Figure 1 for Bayesian approach to Gaussian process regression with uncertain inputs

Figure 2 for Bayesian approach to Gaussian process regression with uncertain inputs

Figure 3 for Bayesian approach to Gaussian process regression with uncertain inputs

Figure 4 for Bayesian approach to Gaussian process regression with uncertain inputs

Abstract:Conventional Gaussian process regression exclusively assumes the existence of noise in the output data of model observations. In many scientific and engineering applications, however, the input locations of observational data may also be compromised with uncertainties owing to modeling assumptions, measurement errors, etc. In this work, we propose a Bayesian method that integrates the variability of input data into Gaussian process regression. Considering two types of observables -- noise-corrupted outputs with fixed inputs and those with prior-distribution-defined uncertain inputs, a posterior distribution is estimated via a Bayesian framework to infer the uncertain data locations. Thereafter, such quantified uncertainties of inputs are incorporated into Gaussian process predictions by means of marginalization. The effectiveness of this new regression technique is demonstrated through several numerical examples, in which a consistently good performance of generalization is observed, while a substantial reduction in the predictive uncertainties is achieved by the Bayesian inference of uncertain inputs.

Via

Access Paper or Ask Questions

Deep Kernel Learning of Dynamical Models from High-Dimensional Noisy Data

Aug 27, 2022

Nicolò Botteghi, Mengwu Guo, Christoph Brune

Figure 1 for Deep Kernel Learning of Dynamical Models from High-Dimensional Noisy Data

Figure 2 for Deep Kernel Learning of Dynamical Models from High-Dimensional Noisy Data

Figure 3 for Deep Kernel Learning of Dynamical Models from High-Dimensional Noisy Data

Figure 4 for Deep Kernel Learning of Dynamical Models from High-Dimensional Noisy Data

Abstract:This work proposes a Stochastic Variational Deep Kernel Learning method for the data-driven discovery of low-dimensional dynamical models from high-dimensional noisy data. The framework is composed of an encoder that compresses high-dimensional measurements into low-dimensional state variables, and a latent dynamical model for the state variables that predicts the system evolution over time. The training of the proposed model is carried out in an unsupervised manner, i.e., not relying on labeled data. Our learning method is evaluated on the motion of a pendulum -- a well studied baseline for nonlinear model identification and control with continuous states and control inputs -- measured via high-dimensional noisy RGB images. Results show that the method can effectively denoise measurements, learn compact state representations and latent dynamical models, as well as identify and quantify modeling uncertainties.

Via

Access Paper or Ask Questions

Multi-fidelity surrogate modeling using long short-term memory networks

Aug 05, 2022

Paolo Conti, Mengwu Guo, Andrea Manzoni, Jan S. Hesthaven

Figure 1 for Multi-fidelity surrogate modeling using long short-term memory networks

Figure 2 for Multi-fidelity surrogate modeling using long short-term memory networks

Figure 3 for Multi-fidelity surrogate modeling using long short-term memory networks

Figure 4 for Multi-fidelity surrogate modeling using long short-term memory networks

Abstract:When evaluating quantities of interest that depend on the solutions to differential equations, we inevitably face the trade-off between accuracy and efficiency. Especially for parametrized, time dependent problems in engineering computations, it is often the case that acceptable computational budgets limit the availability of high-fidelity, accurate simulation data. Multi-fidelity surrogate modeling has emerged as an effective strategy to overcome this difficulty. Its key idea is to leverage many low-fidelity simulation data, less accurate but much faster to compute, to improve the approximations with limited high-fidelity data. In this work, we introduce a novel data-driven framework of multi-fidelity surrogate modeling for parametrized, time-dependent problems using long short-term memory (LSTM) networks, to enhance output predictions both for unseen parameter values and forward in time simultaneously - a task known to be particularly challenging for data-driven models. We demonstrate the wide applicability of the proposed approaches in a variety of engineering problems with high- and low-fidelity data generated through fine versus coarse meshes, small versus large time steps, or finite element full-order versus deep learning reduced-order models. Numerical results show that the proposed multi-fidelity LSTM networks not only improve single-fidelity regression significantly, but also outperform the multi-fidelity models based on feed-forward neural networks.

Via

Access Paper or Ask Questions