Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kostis Gourgoulias

DeepClean: Machine Unlearning on the Cheap by Resetting Privacy Sensitive Weights using the Fisher Diagonal

Nov 17, 2023

Jiaeli Shi, Najah Ghalyan, Kostis Gourgoulias, John Buford, Sean Moran

Figure 1 for DeepClean: Machine Unlearning on the Cheap by Resetting Privacy Sensitive Weights using the Fisher Diagonal

Figure 2 for DeepClean: Machine Unlearning on the Cheap by Resetting Privacy Sensitive Weights using the Fisher Diagonal

Figure 3 for DeepClean: Machine Unlearning on the Cheap by Resetting Privacy Sensitive Weights using the Fisher Diagonal

Figure 4 for DeepClean: Machine Unlearning on the Cheap by Resetting Privacy Sensitive Weights using the Fisher Diagonal

Abstract:Machine learning models trained on sensitive or private data can inadvertently memorize and leak that information. Machine unlearning seeks to retroactively remove such details from model weights to protect privacy. We contribute a lightweight unlearning algorithm that leverages the Fisher Information Matrix (FIM) for selective forgetting. Prior work in this area requires full retraining or large matrix inversions, which are computationally expensive. Our key insight is that the diagonal elements of the FIM, which measure the sensitivity of log-likelihood to changes in weights, contain sufficient information for effective forgetting. Specifically, we compute the FIM diagonal over two subsets -- the data to retain and forget -- for all trainable weights. This diagonal representation approximates the complete FIM while dramatically reducing computation. We then use it to selectively update weights to maximize forgetting of the sensitive subset while minimizing impact on the retained subset. Experiments show that our algorithm can successfully forget any randomly selected subsets of training data across neural network architectures. By leveraging the FIM diagonal, our approach provides an interpretable, lightweight, and efficient solution for machine unlearning with practical privacy benefits.

Via

Access Paper or Ask Questions

An Unsupervised Method for Estimating Class Separability of Datasets with Application to LLMs Fine-Tuning

May 24, 2023

Najah Ghalyan, Kostis Gourgoulias, Yash Satsangi, Sean Moran, Maxime Labonne, Joseph Sabelja

Figure 1 for An Unsupervised Method for Estimating Class Separability of Datasets with Application to LLMs Fine-Tuning

Figure 2 for An Unsupervised Method for Estimating Class Separability of Datasets with Application to LLMs Fine-Tuning

Figure 3 for An Unsupervised Method for Estimating Class Separability of Datasets with Application to LLMs Fine-Tuning

Figure 4 for An Unsupervised Method for Estimating Class Separability of Datasets with Application to LLMs Fine-Tuning

Abstract:This paper proposes an unsupervised method that leverages topological characteristics of data manifolds to estimate class separability of the data without requiring labels. Experiments conducted in this paper on several datasets demonstrate a clear correlation and consistency between the class separability estimated by the proposed method with supervised metrics like Fisher Discriminant Ratio~(FDR) and cross-validation of a classifier, which both require labels. This can enable implementing learning paradigms aimed at learning from both labeled and unlabeled data, like semi-supervised and transductive learning. This would be particularly useful when we have limited labeled data and a relatively large unlabeled dataset that can be used to enhance the learning process. The proposed method is implemented for language model fine-tuning with automated stopping criterion by monitoring class separability of the embedding-space manifold in an unsupervised setting. The proposed methodology has been first validated on synthetic data, where the results show a clear consistency between class separability estimated by the proposed method and class separability computed by FDR. The method has been also implemented on both public and internal data. The results show that the proposed method can effectively aid -- without the need for labels -- a decision on when to stop or continue the fine-tuning of a language model and which fine-tuning iteration is expected to achieve a maximum classification performance through quantification of the class separability of the embedding manifold.

Via

Access Paper or Ask Questions

Learning medical triage from clinicians using Deep Q-Learning

Mar 28, 2020

Albert Buchard, Baptiste Bouvier, Giulia Prando, Rory Beard, Michail Livieratos, Dan Busbridge, Daniel Thompson, Jonathan Richens, Yuanzhao Zhang, Adam Baker(+3 more)

Figure 1 for Learning medical triage from clinicians using Deep Q-Learning

Figure 2 for Learning medical triage from clinicians using Deep Q-Learning

Figure 3 for Learning medical triage from clinicians using Deep Q-Learning

Figure 4 for Learning medical triage from clinicians using Deep Q-Learning

Abstract:Medical Triage is of paramount importance to healthcare systems, allowing for the correct orientation of patients and allocation of the necessary resources to treat them adequately. While reliable decision-tree methods exist to triage patients based on their presentation, those trees implicitly require human inference and are not immediately applicable in a fully automated setting. On the other hand, learning triage policies directly from experts may correct for some of the limitations of hard-coded decision-trees. In this work, we present a Deep Reinforcement Learning approach (a variant of DeepQ-Learning) to triage patients using curated clinical vignettes. The dataset, consisting of 1374 clinical vignettes, was created by medical doctors to represent real-life cases. Each vignette is associated with an average of 3.8 expert triage decisions given by medical doctors relying solely on medical history. We show that this approach is on a par with human performance, yielding safe triage decisions in 94% of cases, and matching expert decisions in 85% of cases. The trained agent learns when to stop asking questions, acquires optimized decision policies requiring less evidence than supervised approaches, and adapts to the novelty of a situation by asking for more information. Overall, we demonstrate that a Deep Reinforcement Learning approach can learn effective medical triage policies directly from expert decisions, without requiring expert knowledge engineering. This approach is scalable and can be deployed in healthcare settings or geographical regions with distinct triage specifications, or where trained experts are scarce, to improve decision making in the early stage of care.

* 17 pages, 4 figures, 3 tables, preprint, in press

Via

Access Paper or Ask Questions

Masking schemes for universal marginalisers

Jan 16, 2020

Divya Gautam, Maria Lomeli, Kostis Gourgoulias, Daniel H. Thompson, Saurabh Johri

Figure 1 for Masking schemes for universal marginalisers

Figure 2 for Masking schemes for universal marginalisers

Figure 3 for Masking schemes for universal marginalisers

Figure 4 for Masking schemes for universal marginalisers

Abstract:We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) in order to learn conditional distributions of the form $P(x_i |\mathbf x_{\mathbf b})$, where $x_i$ is a given random variable and $\mathbf x_{\mathbf b}$ is some arbitrary subset of all random variables of the generative model of interest. In other words, we mimic the self-supervised training of a denoising autoencoder, where a dataset of unlabelled data is used as partially observed input and the neural approximator is optimised to minimise reconstruction loss. We focus on studying the underlying process of the partially observed data---how good is the neural approximator at learning all conditional distributions when the observation process at prediction time differs from the masking process during training? We compare networks trained with different masking schemes in terms of their predictive performance and generalisation properties.

* To be published in Proceedings of the 2nd Symposium on Advances in Approximate Bayesian Inference, 2019

Via

Access Paper or Ask Questions

MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming

Oct 17, 2019

Yura Perov, Logan Graham, Kostis Gourgoulias, Jonathan G. Richens, Ciarán M. Lee, Adam Baker, Saurabh Johri

Figure 1 for MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming

Figure 2 for MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming

Figure 3 for MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming

Figure 4 for MultiVerse: Causal Reasoning using Importance Sampling in Probabilistic Programming

Abstract:We elaborate on using importance sampling for causal reasoning, in particular for counterfactual inference. We show how this can be implemented natively in probabilistic programming. By considering the structure of the counterfactual query, one can significantly optimise the inference process. We also consider design choices to enable further optimisations. We introduce MultiVerse, a probabilistic programming prototype engine for approximate causal reasoning. We provide experimental results and compare with Pyro, an existing probabilistic programming framework with some of causal reasoning tools.

* Logan and Yura have made equal contributions to the paper

Via

Access Paper or Ask Questions

Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs

Oct 16, 2019

Robert Walecki, Kostis Gourgoulias, Adam Baker, Chris Hart, Chris Lucas, Max Zwiessele, Albert Buchard, Maria Lomeli, Yura Perov, Saurabh Johri

Figure 1 for Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs

Figure 2 for Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs

Figure 3 for Universal Marginaliser for Deep Amortised Inference for Probabilistic Programs

Abstract:Probabilistic programming languages (PPLs) are powerful modelling tools which allow to formalise our knowledge about the world and reason about its inherent uncertainty. Inference methods used in PPL can be computationally costly due to significant time burden and/or storage requirements; or they can lack theoretical guarantees of convergence and accuracy when applied to large scale graphical models. To this end, we present the Universal Marginaliser (UM), a novel method for amortised inference, in PPL. We show how combining samples drawn from the original probabilistic program prior with an appropriate augmentation method allows us to train one neural network to approximate any of the corresponding conditional marginal distributions, with any separation into latent and observed variables, and thus amortise the cost of inference. Finally, we benchmark the method on multiple probabilistic programs, in Pyro, with different model structure.

Via

Access Paper or Ask Questions

Universal Marginalizer for Amortised Inference and Embedding of Generative Models

Nov 12, 2018

Robert Walecki, Albert Buchard, Kostis Gourgoulias, Chris Hart, Maria Lomeli, A. K. W. Navarro, Max Zwiessele, Yura Perov, Saurabh Johri

Figure 1 for Universal Marginalizer for Amortised Inference and Embedding of Generative Models

Figure 2 for Universal Marginalizer for Amortised Inference and Embedding of Generative Models

Figure 3 for Universal Marginalizer for Amortised Inference and Embedding of Generative Models

Figure 4 for Universal Marginalizer for Amortised Inference and Embedding of Generative Models

Abstract:Probabilistic graphical models are powerful tools which allow us to formalise our knowledge about the world and reason about its inherent uncertainty. There exist a considerable number of methods for performing inference in probabilistic graphical models; however, they can be computationally costly due to significant time burden and/or storage requirements; or they lack theoretical guarantees of convergence and accuracy when applied to large scale graphical models. To this end, we propose the Universal Marginaliser Importance Sampler (UM-IS) -- a hybrid inference scheme that combines the flexibility of a deep neural network trained on samples from the model and inherits the asymptotic guarantees of importance sampling. We show how combining samples drawn from the graphical model with an appropriate masking function allows us to train a single neural network to approximate any of the corresponding conditional marginal distributions, and thus amortise the cost of inference. We also show that the graph embeddings can be applied for tasks such as: clustering, classification and interpretation of relationships between the nodes. Finally, we benchmark the method on a large graph (>1000 nodes), showing that UM-IS outperforms sampling-based methods by a large margin while being computationally efficient.

Via

Access Paper or Ask Questions