Abstract:Seawater intrusion into coastal aquifers poses a significant threat to groundwater resources, especially with rising sea levels due to climate change. Accurate modeling and uncertainty quantification of this process are crucial but are often hindered by the high computational costs of traditional numerical simulations. In this work, we develop GeoFUSE, a novel deep-learning-based surrogate framework that integrates the U-Net Fourier Neural Operator (U-FNO) with Principal Component Analysis (PCA) and Ensemble Smoother with Multiple Data Assimilation (ESMDA). GeoFUSE enables fast and efficient simulation of seawater intrusion while significantly reducing uncertainty in model predictions. We apply GeoFUSE to a 2D cross-section of the Beaver Creek tidal stream-floodplain system in Washington State. Using 1,500 geological realizations, we train the U-FNO surrogate model to approximate salinity distribution and accumulation. The U-FNO model successfully reduces the computational time from hours (using PFLOTRAN simulations) to seconds, achieving a speedup of approximately 360,000 times while maintaining high accuracy. By integrating measurement data from monitoring wells, the framework significantly reduces geological uncertainty and improves the predictive accuracy of the salinity distribution over a 20-year period. Our results demonstrate that GeoFUSE improves computational efficiency and provides a robust tool for real-time uncertainty quantification and decision making in groundwater management. Future work will extend GeoFUSE to 3D models and incorporate additional factors such as sea-level rise and extreme weather events, making it applicable to a broader range of coastal and subsurface flow systems.
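The PCA parameterization mentioned in the abstract above can be illustrated with a minimal NumPy sketch. This is not the GeoFUSE implementation: the random array standing in for the geological realizations, the grid size, the number of retained components, and the names fit_pca and generate_realization are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(realizations, n_components):
    """Fit a truncated PCA basis to an (n_real, n_cells) array."""
    mean = realizations.mean(axis=0)
    centered = realizations - mean
    # Thin SVD of the centered data; rows of vt are the principal directions.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    n_real = realizations.shape[0]
    # Scale each direction so that xi ~ N(0, I) reproduces the
    # (truncated) sample covariance of the realizations.
    basis = vt[:n_components].T * (s[:n_components] / np.sqrt(n_real - 1))
    return mean, basis

def generate_realization(mean, basis, xi):
    """Map a low-dimensional latent vector xi to a full field."""
    return mean + basis @ xi

rng = np.random.default_rng(0)
# Hypothetical stand-in for 1,500 property fields on a 200-cell grid.
prior = rng.normal(size=(1500, 200))
mean, basis = fit_pca(prior, n_components=50)
xi = rng.standard_normal(50)
m_new = generate_realization(mean, basis, xi)
```

In a history matching loop, the latent vector xi (here 50 values) would be updated instead of the full field, which is what makes the parameterization useful for ensemble methods.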
Abstract:History matching based on monitoring data will enable uncertainty reduction, and thus improved aquifer management, in industrial-scale carbon storage operations. In traditional model-based data assimilation, geomodel parameters are modified to force agreement between flow simulation results and observations. In data-space inversion (DSI), history-matched quantities of interest, e.g., posterior pressure and saturation fields conditioned to observations, are inferred directly, without constructing posterior geomodels. This is accomplished efficiently using a set of O(1000) prior simulation results, data parameterization, and posterior sampling within a Bayesian setting. In this study, we develop and implement (in DSI) a deep-learning-based parameterization to represent spatio-temporal pressure and CO2 saturation fields at a set of time steps. The new parameterization uses an adversarial autoencoder (AAE) for dimension reduction and a convolutional long short-term memory (convLSTM) network to represent the spatial distribution and temporal evolution of the pressure and saturation fields. This parameterization is used with an ensemble smoother with multiple data assimilation (ESMDA) in the DSI framework to enable posterior predictions. A realistic 3D system characterized by prior geological realizations drawn from a range of geological scenarios is considered. A local grid refinement procedure is introduced to estimate the error covariance term that appears in the history matching formulation. Extensive history matching results are presented for various quantities, for multiple synthetic true models. Substantial uncertainty reduction in posterior pressure and saturation fields is achieved in all cases. The framework is applied to efficiently provide posterior predictions for a range of error covariance specifications. Such an assessment would be expensive using a model-based approach.
Abstract:Deep-learning-based surrogate models show great promise for use in geological carbon storage operations. In this work we target an important application: the history matching of storage systems characterized by a high degree of (prior) geological uncertainty. Toward this goal, we extend the recently introduced recurrent R-U-Net surrogate model to treat geomodel realizations drawn from a wide range of geological scenarios. These scenarios are defined by a set of metaparameters, which include the mean and standard deviation of log-permeability, permeability anisotropy ratio, horizontal correlation length, etc. An infinite number of realizations can be generated for each set of metaparameters, so the range of prior uncertainty is large. The surrogate model is trained with flow simulation results, generated using the open-source simulator GEOS, for 2000 random realizations. The flow problems involve four wells, each injecting 1 Mt CO2/year, for 30 years. The trained surrogate model is shown to provide accurate predictions for new realizations over the full range of geological scenarios, with median relative error of 1.3% in pressure and 4.5% in saturation. The surrogate model is incorporated into a Markov chain Monte Carlo history matching workflow, where the goal is to generate history matched realizations and posterior estimates of the metaparameters. We show that, using observed data from monitoring wells in synthetic 'true' models, geological uncertainty is reduced substantially. This leads to posterior 3D pressure and saturation fields that display much closer agreement with the true-model responses than do prior predictions.
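The idea of placing a fast surrogate inside a Markov chain Monte Carlo loop can be sketched as follows. This is a plain random-walk Metropolis sampler on a toy problem, not the workflow of the abstract above: the two-parameter linear function named surrogate, the uniform prior bounds, the noise level, and the step size are all hypothetical choices for illustration.

```python
import numpy as np

def surrogate(theta):
    """Hypothetical stand-in for a trained surrogate model."""
    a, b = theta
    t = np.linspace(0.0, 1.0, 5)
    return a + b * t

def log_posterior(theta, d_obs, sigma, lower, upper):
    # Uniform prior inside the bounds, Gaussian likelihood with std sigma.
    if np.any(theta < lower) or np.any(theta > upper):
        return -np.inf
    resid = surrogate(theta) - d_obs
    return -0.5 * np.sum((resid / sigma) ** 2)

def metropolis(d_obs, sigma, lower, upper, n_steps, step, rng):
    theta = 0.5 * (lower + upper)  # start at the center of the prior
    logp = log_posterior(theta, d_obs, sigma, lower, upper)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.size)
        logp_prop = log_posterior(proposal, d_obs, sigma, lower, upper)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < logp_prop - logp:
            theta, logp = proposal, logp_prop
        chain[i] = theta
    return chain

rng = np.random.default_rng(1)
lower, upper = np.array([0.0, 0.0]), np.array([4.0, 4.0])
theta_true = np.array([1.0, 2.0])
sigma = 0.05
d_obs = surrogate(theta_true) + sigma * rng.standard_normal(5)
chain = metropolis(d_obs, sigma, lower, upper,
                   n_steps=20000, step=0.05, rng=rng)
posterior = chain[5000:]  # discard burn-in
```

Each step calls the forward model once, so tens of thousands of evaluations are needed; this is why a surrogate that runs in a fraction of a second matters for this kind of sampler.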
Abstract:Data assimilation presents computational challenges because many high-fidelity models must be simulated. Various deep-learning-based surrogate modeling techniques have been developed to reduce the simulation costs associated with these applications. However, to construct data-driven surrogate models, several thousand high-fidelity simulation runs may be required to provide training samples, and these computations can make training prohibitively expensive. To address this issue, in this work we present a framework where most of the training simulations are performed on coarsened geomodels. These models are constructed using a flow-based upscaling method. The framework entails the use of a transfer-learning procedure, incorporated within an existing recurrent residual U-Net architecture, in which network training is accomplished in three steps. In the first step, where the bulk of the training is performed, only low-fidelity simulation results are used. The second and third steps, in which the output layer is trained and the overall network is fine-tuned, require a relatively small number of high-fidelity simulations. Here we use 2500 low-fidelity runs and 200 high-fidelity runs, which leads to about a 90% reduction in training simulation costs. The method is applied for two-phase subsurface flow in 3D channelized systems, with flow driven by wells. The surrogate model trained with multifidelity data is shown to be nearly as accurate as a reference surrogate trained with only high-fidelity data in predicting dynamic pressure and saturation fields in new geomodels. Importantly, the network provides results that are significantly more accurate than the low-fidelity simulations used for most of the training. The multifidelity surrogate is also applied for history matching using an ensemble-based procedure, where accuracy relative to reference results is again demonstrated.
Abstract:Fast forecasting of reservoir pressure distribution in geologic carbon storage (GCS) by assimilating monitoring data is a challenging problem. Due to high drilling cost, GCS projects usually have spatially sparse measurements from wells, leading to high uncertainties in reservoir pressure prediction. To address this challenge, we propose to use low-cost Interferometric Synthetic-Aperture Radar (InSAR) data as monitoring data to infer reservoir pressure buildup. We develop a deep learning-accelerated workflow to assimilate surface displacement maps interpreted from InSAR and to forecast dynamic reservoir pressure. Employing an Ensemble Smoother with Multiple Data Assimilation (ES-MDA) framework, the workflow updates three-dimensional (3D) geologic properties and predicts reservoir pressure with quantified uncertainties. We use a synthetic commercial-scale GCS model with bimodally distributed permeability and porosity to demonstrate the efficacy of the workflow. A two-step CNN-PCA approach is employed to parameterize the bimodal fields. The computational efficiency of the workflow is boosted by two residual U-Net-based surrogate models for surface displacement and reservoir pressure predictions, respectively. The workflow can complete data assimilation and reservoir pressure forecasting in half an hour on a personal computer.
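The ES-MDA update referred to in the abstract above can be sketched in a few lines of NumPy. This follows the standard formulation (perturbed observations with an inflated error covariance, and inflation coefficients whose reciprocals sum to one); it is not the code of the workflow described. The linear forward model g, the ensemble size, and the noise variance are hypothetical, and forward stands in for whatever simulator or surrogate maps parameters to predicted data.

```python
import numpy as np

def esmda(m_prior, forward, d_obs, c_d, alphas, rng):
    """Ensemble smoother with multiple data assimilation (illustrative).

    m_prior : (n_ens, n_param) prior ensemble
    forward : callable mapping (n_param,) -> (n_data,)
    d_obs   : (n_data,) observations
    c_d     : (n_data,) diagonal of the measurement-error covariance
    alphas  : inflation coefficients; their reciprocals must sum to 1
    """
    alphas = np.asarray(alphas, dtype=float)
    if not np.isclose(np.sum(1.0 / alphas), 1.0):
        raise ValueError("sum(1/alpha) must equal 1")
    m = m_prior.copy()
    n_ens = m.shape[0]
    for alpha in alphas:
        d = np.array([forward(mj) for mj in m])   # (n_ens, n_data)
        dm = m - m.mean(axis=0)
        dd = d - d.mean(axis=0)
        c_md = dm.T @ dd / (n_ens - 1)            # parameter-data covariance
        c_dd = dd.T @ dd / (n_ens - 1)            # data auto-covariance
        # Perturb observations with inflated measurement error.
        d_pert = d_obs + np.sqrt(alpha * c_d) * rng.standard_normal(d.shape)
        a = c_dd + alpha * np.diag(c_d)
        # m <- m + C_MD (C_DD + alpha C_D)^{-1} (d_pert - d)
        m = m + np.linalg.solve(a, (d_pert - d).T).T @ c_md.T
    return m

rng = np.random.default_rng(2)
g = rng.normal(size=(4, 3))                 # hypothetical linear forward model
m_true = np.array([1.0, -1.0, 0.5])
c_d = np.full(4, 0.01)
d_obs = g @ m_true + np.sqrt(c_d) * rng.standard_normal(4)
m_prior = rng.normal(size=(200, 3))
m_post = esmda(m_prior, lambda m: g @ m, d_obs, c_d,
               alphas=[4.0] * 4, rng=rng)
```

Each assimilation step requires one forward evaluation per ensemble member, which is where surrogate models provide most of the speedup.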
Abstract:Data-space inversion (DSI) and related procedures represent a family of methods applicable for data assimilation in subsurface flow settings. These methods differ from model-based techniques in that they provide only posterior predictions for quantities (time series) of interest, not posterior models with calibrated parameters. DSI methods require a large number of flow simulations to first be performed on prior geological realizations. Given observed data, posterior predictions can then be generated directly. DSI operates in a Bayesian setting and provides posterior samples of the data vector. In this work we develop and evaluate a new approach for data parameterization in DSI. Parameterization reduces the number of variables to determine in the inversion, and it maintains the physical character of the data variables. The new parameterization uses a recurrent autoencoder (RAE) for dimension reduction, and a long short-term memory (LSTM) network to represent flow-rate time series. The RAE-based parameterization is combined with an ensemble smoother with multiple data assimilation (ESMDA) for posterior generation. Results are presented for two- and three-phase flow in a 2D channelized system and a 3D multi-Gaussian model. The RAE procedure, along with existing DSI treatments, is assessed through comparison to reference rejection sampling (RS) results. The new DSI methodology is shown to consistently outperform existing approaches, in terms of statistical agreement with RS results. The method is also shown to accurately capture derived quantities, which are computed from variables considered directly in DSI. This requires correlation and covariance between variables to be properly captured, and accuracy in these relationships is demonstrated. The RAE-based parameterization developed here is clearly useful in DSI, and it may also find application in other subsurface flow problems.
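Rejection sampling, used as the reference in the abstract above, can be illustrated on a toy problem where the exact posterior is known. The sketch below is an assumption-laden example, not the paper's setup: it uses a scalar parameter with a standard normal prior, an identity forward model, and a Gaussian likelihood, and it bounds the unnormalized likelihood by 1 rather than by the largest likelihood found among the samples.

```python
import numpy as np

def rejection_sample(prior_samples, predictions, d_obs, sigma, rng):
    """Keep prior samples with probability equal to their likelihood.

    prior_samples : (n, n_param) samples from the prior
    predictions   : (n, n_data) forward-model output for each sample
    d_obs         : (n_data,) observed data
    sigma         : measurement-error standard deviation
    """
    resid = (predictions - d_obs) / sigma
    # Unnormalized Gaussian log-likelihood; it is <= 0, so the
    # likelihood is bounded by 1 and is a valid acceptance probability.
    log_like = -0.5 * np.sum(resid ** 2, axis=1)
    accept = np.log(rng.uniform(size=log_like.size)) < log_like
    return prior_samples[accept]

rng = np.random.default_rng(3)
prior_samples = rng.standard_normal(size=(200_000, 1))  # m ~ N(0, 1)
predictions = prior_samples                              # toy forward model d = m
d_obs = np.array([1.0])
sigma = 0.5
post = rejection_sample(prior_samples, predictions, d_obs, sigma, rng)
# For this conjugate Gaussian case the exact posterior is N(0.8, 0.2),
# so post.mean() and post.var() can be compared against those values.
```

Because every accepted sample is an exact posterior draw, the method makes a clean reference, but most prior samples are discarded, which limits it to cases where many forward runs are affordable.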