Abstract:Access to granular demand data is essential for the net zero transition; it allows for accurate profiling and active demand management as our reliance on variable renewable generation increases. However, public release of this data is often impossible due to privacy concerns. Good quality synthetic data can circumnavigate this issue. Despite significant research on generating synthetic smart meter data, there is still insufficient work on creating a consistent evaluation framework. In this paper, we investigate how common frameworks used by other industries leveraging synthetic data, can be applied to synthetic smart meter data, such as fidelity, utility and privacy. We also recommend specific metrics to ensure that defining aspects of smart meter data are preserved and test the extent to which privacy can be protected using differential privacy. We show that standard privacy attack methods like reconstruction or membership inference attacks are inadequate for assessing privacy risks of smart meter datasets. We propose an improved method by injecting training data with implausible outliers, then launching privacy attacks directly on these outliers. The choice of $\epsilon$ (a metric of privacy loss) significantly impacts privacy risk, highlighting the necessity of performing these explicit privacy tests when making trade-offs between fidelity and privacy.
Abstract:In this paper, an innovative multi-modal deep learning model is proposed to deeply integrate heterogeneous information from medical images and clinical reports. First, for medical images, convolutional neural networks were used to extract high-dimensional features and capture key visual information such as focal details, texture and spatial distribution. Secondly, for clinical report text, a two-way long and short-term memory network combined with an attention mechanism is used for deep semantic understanding, and key statements related to the disease are accurately captured. The two features interact and integrate effectively through the designed multi-modal fusion layer to realize the joint representation learning of image and text. In the empirical study, we selected a large medical image database covering a variety of diseases, combined with corresponding clinical reports for model training and validation. The proposed multimodal deep learning model demonstrated substantial superiority in the realms of disease classification, lesion localization, and clinical description generation, as evidenced by the experimental results.
Abstract:Access to smart meter data is essential to rapid and successful transitions to electrified grids, underpinned by flexibility delivered by low carbon technologies, such as electric vehicles (EV) and heat pumps, and powered by renewable energy. Yet little of this data is available for research and modelling purposes due consumer privacy protections. Whilst many are calling for raw datasets to be unlocked through regulatory changes, we believe this approach will take too long. Synthetic data addresses these challenges directly by overcoming privacy issues. In this paper, we present Faraday, a Variational Auto-encoder (VAE)-based model trained over 300 million smart meter data readings from an energy supplier in the UK, with information such as property type and low carbon technologies (LCTs) ownership. The model produces household-level synthetic load profiles conditioned on these labels, and we compare its outputs against actual substation readings to show how the model can be used for real-world applications by grid modellers interested in modelling energy grids of the future.