Abstract: Soil and groundwater contamination is a pervasive problem at thousands of locations across the world. Contaminated sites often require decades of remediation or of monitoring natural attenuation. Climate change exacerbates the long-term site management problem because extreme precipitation and/or shifts in precipitation/evapotranspiration regimes could re-mobilize contaminants and spread the affected groundwater. To quickly assess the spatiotemporal variations of groundwater contamination under uncertain climate disturbances, we develop a physics-informed machine learning surrogate model that uses a U-Net enhanced Fourier Neural Operator (U-FNO) to solve the partial differential equations (PDEs) of groundwater flow and contaminant transport at the site scale. We develop a combined loss function that includes both data-driven terms and physical boundary constraints at multiple spatiotemporal scales. Our U-FNOs can reliably predict the spatiotemporal variations of groundwater flow and contaminant transport properties from 1954 to 2100 under realistic climate projections. In parallel, we develop a convolutional autoencoder combined with online clustering to reduce the dimensionality of the vast historical and projected climate data by quantifying climatic region similarities across the United States. The resulting ML-based climate clusters provide climate projections for the surrogate modeling and return reliable future recharge rate projections immediately, without querying large climate datasets. Overall, this Multi-scale Digital Twin work can advance the field of environmental remediation under climate change.
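The combined loss described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the functional form, so everything here is an assumption: the data-driven term is taken to be a relative L2 misfit, the physical constraint is taken to be a mean squared deviation from a prescribed boundary value, and the names `combined_loss`, `boundary_mask`, `boundary_value`, and the weight `lam` are hypothetical. The multi-scale aspect mentioned in the abstract is not modeled.

```python
import numpy as np

def combined_loss(pred, target, boundary_mask, boundary_value, lam=0.1):
    """Data misfit plus a boundary penalty (illustrative form, not the authors').

    pred, target  : arrays of shape (time, rows, cols)
    boundary_mask : boolean array of shape (rows, cols), True on boundary cells
    boundary_value: prescribed value on the boundary (assumed constant here)
    lam           : hypothetical weight balancing the two terms
    """
    # Data-driven term: relative L2 error over the whole space-time field.
    data_term = np.linalg.norm(pred - target) / np.linalg.norm(target)
    # Physics term: mean squared deviation from the boundary value,
    # evaluated only on boundary cells at every time step.
    boundary_term = np.mean((pred[:, boundary_mask] - boundary_value) ** 2)
    return data_term + lam * boundary_term
```

In this form, `lam` controls how strongly a prediction is penalized for violating the boundary condition relative to its misfit against simulation data; a real implementation would likely tune it and use differentiable tensors rather than NumPy arrays.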
Abstract: Due to the nature of their orbital paths, the NASA Terra and NASA Aqua satellites capture imagery containing swath gaps, which are areas of no data. Swath gaps can overlap the region of interest (ROI) completely, often rendering the entire image unusable by Machine Learning (ML) models. This problem is further exacerbated when the ROI rarely occurs (e.g., a hurricane) and, on occurrence, is partially overlapped by a swath gap. With annotated data as supervision, a model can learn to differentiate between the area of focus and the swath gap. However, annotation is expensive, and the vast majority of existing data is currently unannotated. Hence, we propose an augmentation technique that largely removes swath gaps so that CNNs can focus on the ROI, allowing data with swath gaps to be used successfully for training. We experiment on the UC Merced Land Use Dataset, where we add swath gaps as empty polygons (covering up to 20 percent of the image area) and then apply augmentation techniques to fill the swath gaps. We compare the model trained with our augmentation techniques on the swath gap-filled data with the model trained on the original swath gap-less data and note markedly improved performance. Additionally, we perform a qualitative analysis using activation maps, which visualizes how effectively our trained network ignores the swath gaps. We also evaluate our results against a human baseline and show that, in certain cases, the filled swath gaps look so realistic that even a human evaluator did not distinguish between original satellite images and swath gap-filled images. Since this method is aimed at unlabeled data, it is widely generalizable and impactful for large-scale unannotated datasets from various space data domains.
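The experimental setup described above (adding synthetic swath gaps, then filling them) can be sketched as follows. The abstract does not specify the polygon shapes or the fill technique, so both are assumptions here: the gap is a slanted band of zeros, and the fill simply draws pixels at random from the valid part of the same image. The function names `add_swath_gap` and `fill_gap_random` and all parameters are hypothetical.

```python
import numpy as np

def add_swath_gap(image, start_col, width, slope=0.5):
    """Zero out a slanted band (a simple parallelogram) to imitate a swath gap."""
    h, w = image.shape[:2]
    rows, cols = np.mgrid[0:h, 0:w]
    left_edge = start_col + slope * rows  # band drifts right as the row index grows
    gap_mask = (cols >= left_edge) & (cols < left_edge + width)
    gapped = image.copy()
    gapped[gap_mask] = 0                  # "no data" pixels
    return gapped, gap_mask

def fill_gap_random(image, gap_mask, seed=0):
    """Fill gap pixels with pixels sampled from the valid area (assumed strategy)."""
    rng = np.random.default_rng(seed)
    valid = image[~gap_mask]              # shape (n_valid,) or (n_valid, channels)
    picks = rng.integers(0, len(valid), size=int(gap_mask.sum()))
    filled = image.copy()
    filled[gap_mask] = valid[picks]
    return filled
```

To stay within the gap sizes reported in the abstract, `width` could be chosen so that `gap_mask.mean()` does not exceed 0.2. Random sampling is only a stand-in; it removes the hard "no data" edge but does not preserve local texture.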