Abstract: The key components of machine learning are data samples for training, a model for learning patterns, and a loss function for optimizing accuracy. Analogously, unlearning can potentially be achieved through anti-data samples (or anti-samples), an unlearning method, and a reversed loss function. While prior research has explored unlearning methods and reversed loss functions, the potential of anti-samples remains largely untapped. In this paper, we introduce UnSTAR: Unlearning with Self-Taught Anti-Sample Reasoning for large language models (LLMs). Our contributions are threefold. First, we propose the novel concept of anti-sample-induced unlearning. Second, we generate anti-samples by leveraging misleading rationales, which help reverse learned associations and accelerate the unlearning process. Third, we enable fine-grained targeted unlearning, allowing the selective removal of specific associations without impacting related knowledge, which previous works could not achieve. Results demonstrate that anti-samples offer an efficient, targeted unlearning strategy for LLMs, opening new avenues for privacy-preserving machine learning and model modification.
Abstract: Unlearning methods for recommender systems (RS) have emerged to address privacy issues and concerns about legal compliance. However, evolving user preferences and content licensing issues remain unaddressed. This is particularly true in the case of multi-modal recommender systems (MMRS), which aim to accommodate the growing influence of multi-modal information on user preferences. Previous unlearning methods for RS are inapplicable to MMRS because the multi-modal user-item behavior data graph is incompatible with the matrix-based representation used in RS. Partitioning-based methods degrade recommendation performance and incur significant overhead costs during aggregation. This paper introduces MMRecUN, a new framework for multi-modal recommendation unlearning, which, to the best of our knowledge, is the first attempt in this direction. Given the trained recommendation model and marked forget data, we devise a Reverse Bayesian Personalized Ranking (BPR) objective that forces the model to forget the marked data. MMRecUN employs both reverse and forward BPR loss mechanisms to selectively attenuate the impact of interactions within the forget set while concurrently reinforcing the significance of interactions within the retain set. Our experiments demonstrate that MMRecUN outperforms baseline methods across various unlearning requests when evaluated on benchmark multi-modal recommender datasets. MMRecUN achieves recall performance improvements of up to $\mathbf{49.85\%}$ compared to the baseline methods. It is up to $\mathbf{1.3}\times$ faster than the \textsc{Gold} model, which is trained on retain data from scratch. MMRecUN offers advantages such as superior performance in removing target elements, preservation of performance for retained elements, and zero overhead costs in comparison to previous methods.
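The interplay of reverse and forward BPR losses described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact objective, so everything here is an assumption made for illustration: the standard BPR form $-\log\sigma(s_{pos} - s_{neg})$, a "reverse" term that simply swaps the two scores, and a hypothetical mixing weight `alpha`. The function names are likewise hypothetical and are not taken from MMRecUN.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)).
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bpr_loss(pairs):
    """Forward BPR: mean of -log sigmoid(pos_score - neg_score)."""
    return -sum(log_sigmoid(p - n) for p, n in pairs) / len(pairs)


def reverse_bpr_loss(pairs):
    """Assumed reverse BPR: the same loss with the two scores swapped,
    so lowering the score of a forgotten interaction reduces the loss."""
    return -sum(log_sigmoid(n - p) for p, n in pairs) / len(pairs)


def unlearning_loss(forget_pairs, retain_pairs, alpha=0.5):
    """Hypothetical combination: reverse BPR on the forget set plus
    forward BPR on the retain set, weighted by alpha (an assumption)."""
    return (alpha * reverse_bpr_loss(forget_pairs)
            + (1.0 - alpha) * bpr_loss(retain_pairs))


# Each pair is (score of interacted item, score of sampled negative item).
forget = [(2.0, 0.0)]   # model still ranks the forgotten item highly
retain = [(2.0, 0.0)]   # model ranks the retained item highly
loss = unlearning_loss(forget, retain, alpha=0.5)
```

Under these assumptions, a forgotten interaction that the model still scores highly contributes a large reverse term, while a retained interaction scored highly contributes a small forward term, so minimizing the sum pushes the two sets in opposite directions.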
Abstract: Graph unlearning has emerged as a pivotal method to delete information from a pre-trained graph neural network (GNN). One may delete nodes, a class of nodes, edges, or a class of edges. An unlearning method enables the GNN model to comply with data protection regulations (i.e., the right to be forgotten), adapt to evolving data distributions, and reduce the GPU-hours carbon footprint by avoiding repetitive retraining. Existing partitioning and aggregation-based methods have limitations due to their poor handling of local graph dependencies and additional overhead costs. More recently, GNNDelete offered a model-agnostic approach that alleviates some of these issues. Our work takes a novel approach to these challenges in graph unlearning through knowledge distillation: distill to delete in GNN (D2DGN). It is a model-agnostic distillation framework in which the complete graph knowledge is divided and marked for retention and deletion. It performs distillation with response-based soft targets and feature-based node embeddings while minimizing KL divergence. The unlearned model effectively removes the influence of deleted graph elements while preserving knowledge about the retained graph elements. D2DGN surpasses the performance of existing methods when evaluated on various real-world graph datasets by up to $43.1\%$ (AUC) in edge and node unlearning tasks. Other notable advantages include better efficiency, better performance in removing target elements, preservation of performance for the retained elements, and zero overhead costs. Notably, our D2DGN surpasses the state-of-the-art GNNDelete in AUC by $2.4\%$, improves the membership inference ratio by $+1.3$, requires $10.2\times10^6$ fewer FLOPs per forward pass, and is up to $\mathbf{3.2}\times$ faster.
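As a rough illustration of the response-based part of the distillation described above, the sketch below computes a KL divergence between temperature-softened teacher and student outputs for a single node. The abstract does not specify how D2DGN pairs teachers with retained and deleted elements, nor any temperature or weighting, so the `temperature` parameter and all function names are assumptions made only for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax producing soft targets."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Response-based distillation term: KL between softened teacher
    and student outputs (temperature value is an assumption)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return kl_divergence(p, q)


# A student that matches the teacher incurs no loss; one that
# disagrees incurs a positive loss.
same = distillation_loss([2.0, 0.0, -1.0], [2.0, 0.0, -1.0])
diff = distillation_loss([2.0, 0.0, -1.0], [-1.0, 0.0, 2.0])
```

In an unlearning setting, one plausible use (again an assumption, not confirmed by the abstract) is to apply such a term toward a knowledgeable teacher on retained elements and toward a different target on deleted elements.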