Abstract:We propose a novel Conditional Latent space Variational Autoencoder (CL-VAE) to perform improved pre-processing for anomaly detection on data with known inlier classes and unknown outlier classes. This proposed variational autoencoder (VAE) improves latent space separation by conditioning on information within the data. The method fits a unique prior distribution to each class in the dataset, effectively expanding the classic prior distribution for VAEs to include a Gaussian mixture model. An ensemble of these VAEs are merged in the latent spaces to form a group consensus that greatly improves the accuracy of anomaly detection across data sets. Our approach is compared against the capabilities of a typical VAE, a CNN, and a PCA, with regards AUC for anomaly detection. The proposed model shows increased accuracy in anomaly detection, achieving an AUC of 97.4% on the MNIST dataset compared to 95.7% for the second best model. In addition, the CL-VAE shows increased benefits from ensembling, a more interpretable latent space, and an increased ability to learn patterns in complex data with limited model sizes.
Abstract:We propose a methodology to enhance local CO2 monitoring by integrating satellite data from the Orbiting Carbon Observatories (OCO-2 and OCO-3) with ground level observations from the Integrated Carbon Observation System (ICOS) and weather data from the ECMWF Reanalysis v5 (ERA5). Unlike traditional methods that downsample national data, our approach uses multimodal data fusion for high-resolution CO2 estimations. We employ weighted K-nearest neighbor (KNN) interpolation with machine learning models to predict ground level CO2 from satellite measurements, achieving a Root Mean Squared Error of 3.92 ppm. Our results show the effectiveness of integrating diverse data sources in capturing local emission patterns, highlighting the value of high-resolution atmospheric transport models. The developed model improves the granularity of CO2 monitoring, providing precise insights for targeted carbon mitigation strategies, and represents a novel application of neural networks and KNN in environmental monitoring, adaptable to various regions and temporal scales.