Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ibrahim Khalil

Deep Contrastive Unlearning for Language Models

Mar 19, 2025

Estrid He, Tabinda Sarwar, Ibrahim Khalil, Xun Yi, Ke Wang

Abstract:The past a few years have witnessed the great success of large language models, demonstrating powerful capabilities in comprehending textual data and generating human-like languages. Large language models achieve success by being trained on vast amounts of textual data, including online sources with copyrighted content and user-generated knowledge. However, this comes at a cost: the potential risk of exposing users' privacy and violating copyright protections. Thus, to safeguard individuals' "right to be forgotten", there has been increasing interests in machine unlearning -- the process of removing information carried by particular training samples from a model while not deteriorating its predictive quality. This is a challenging task due to the black-box nature of language models. Most existing studies focus on mitigating the impact of those forgot samples upon a model's outputs, and do not explicitly consider the geometric distributions of samples in the latent space of a model. To address this issue, we propose a machine unlearning framework, named Deep Contrastive Unlearning for fine-Tuning (DeepCUT) language models. Our proposed model achieves machine unlearning by directly optimizing the latent space of a model. Comprehensive experiments on real-world datasets demonstrate the effectiveness and efficiency of DeepCUT with consistent and significant improvement over baseline methods.

Via

Access Paper or Ask Questions

Edge Unlearning is Not "on Edge"! An Adaptive Exact Unlearning System on Resource-Constrained Devices

Oct 15, 2024

Xiaoyu Xia, Ziqi Wang, Ruoxi Sun, Bowen Liu, Ibrahim Khalil, Minhui Xue

Abstract:The right to be forgotten mandates that machine learning models enable the erasure of a data owner's data and information from a trained model. Removing data from the dataset alone is inadequate, as machine learning models can memorize information from the training data, increasing the potential privacy risk to users. To address this, multiple machine unlearning techniques have been developed and deployed. Among them, approximate unlearning is a popular solution, but recent studies report that its unlearning effectiveness is not fully guaranteed. Another approach, exact unlearning, tackles this issue by discarding the data and retraining the model from scratch, but at the cost of considerable computational and memory resources. However, not all devices have the capability to perform such retraining. In numerous machine learning applications, such as edge devices, Internet-of-Things (IoT), mobile devices, and satellites, resources are constrained, posing challenges for deploying existing exact unlearning methods. In this study, we propose a Constraint-aware Adaptive Exact Unlearning System at the network Edge (CAUSE), an approach to enabling exact unlearning on resource-constrained devices. Aiming to minimize the retrain overhead by storing sub-models on the resource-constrained device, CAUSE innovatively applies a Fibonacci-based replacement strategy and updates the number of shards adaptively in the user-based data partition process. To further improve the effectiveness of memory usage, CAUSE leverages the advantage of model pruning to save memory via compression with minimal accuracy sacrifice. The experimental results demonstrate that CAUSE significantly outperforms other representative systems in realizing exact unlearning on the resource-constrained device by 9.23%-80.86%, 66.21%-83.46%, and 5.26%-194.13% in terms of unlearning speed, energy consumption, and accuracy.

* Accepted to IEEE Symposium on Security and Privacy 2025 (Oakland 2025)

Via

Access Paper or Ask Questions

Machine Unlearning with Minimal Gradient Dependence for High Unlearning Ratios

Jun 24, 2024

Tao Huang, Ziyang Chen, Jiayang Meng, Qingyu Huang, Xu Yang, Xun Yi, Ibrahim Khalil

Figure 1 for Machine Unlearning with Minimal Gradient Dependence for High Unlearning Ratios

Figure 2 for Machine Unlearning with Minimal Gradient Dependence for High Unlearning Ratios

Figure 3 for Machine Unlearning with Minimal Gradient Dependence for High Unlearning Ratios

Abstract:In the context of machine unlearning, the primary challenge lies in effectively removing traces of private data from trained models while maintaining model performance and security against privacy attacks like membership inference attacks. Traditional gradient-based unlearning methods often rely on extensive historical gradients, which becomes impractical with high unlearning ratios and may reduce the effectiveness of unlearning. Addressing these limitations, we introduce Mini-Unlearning, a novel approach that capitalizes on a critical observation: unlearned parameters correlate with retrained parameters through contraction mapping. Our method, Mini-Unlearning, utilizes a minimal subset of historical gradients and leverages this contraction mapping to facilitate scalable, efficient unlearning. This lightweight, scalable method significantly enhances model accuracy and strengthens resistance to membership inference attacks. Our experiments demonstrate that Mini-Unlearning not only works under higher unlearning ratios but also outperforms existing techniques in both accuracy and security, offering a promising solution for applications requiring robust unlearning capabilities.

Via

Access Paper or Ask Questions

Smart Policy Control for Securing Federated Learning Management System

May 18, 2023

Aditya Pribadi Kalapaaking, Ibrahim Khalil, Mohammed Atiquzzaman

Abstract:The widespread adoption of Internet of Things (IoT) devices in smart cities, intelligent healthcare systems, and various real-world applications have resulted in the generation of vast amounts of data, often analyzed using different Machine Learning (ML) models. Federated learning (FL) has been acknowledged as a privacy-preserving machine learning technology, where multiple parties cooperatively train ML models without exchanging raw data. However, the current FL architecture does not allow for an audit of the training process due to the various data-protection policies implemented by each FL participant. Furthermore, there is no global model verifiability available in the current architecture. This paper proposes a smart contract-based policy control for securing the Federated Learning (FL) management system. First, we develop and deploy a smart contract-based local training policy control on the FL participants' side. This policy control is used to verify the training process, ensuring that the evaluation process follows the same rules for all FL participants. We then enforce a smart contract-based aggregation policy to manage the global model aggregation process. Upon completion, the aggregated model and policy are stored on blockchain-based storage. Subsequently, we distribute the aggregated global model and the smart contract to all FL participants. Our proposed method uses smart policy control to manage access and verify the integrity of machine learning models. We conducted multiple experiments with various machine learning architectures and datasets to evaluate our proposed framework, such as MNIST and CIFAR-10.

* IEEE Transactions on Network and Service Management, 2023

Via

Access Paper or Ask Questions

Privacy-Preserving Ensemble Infused Enhanced Deep Neural Network Framework for Edge Cloud Convergence

May 16, 2023

Veronika Stephanie, Ibrahim Khalil, Mohammad Saidur Rahman, Mohammed Atiquzzaman

Abstract:We propose a privacy-preserving ensemble infused enhanced Deep Neural Network (DNN) based learning framework in this paper for Internet-of-Things (IoT), edge, and cloud convergence in the context of healthcare. In the convergence, edge server is used for both storing IoT produced bioimage and hosting DNN algorithm for local model training. The cloud is used for ensembling local models. The DNN-based training process of a model with a local dataset suffers from low accuracy, which can be improved by the aforementioned convergence and Ensemble Learning. The ensemble learning allows multiple participants to outsource their local model for producing a generalized final model with high accuracy. Nevertheless, Ensemble Learning elevates the risk of leaking sensitive private data from the final model. The proposed framework presents a Differential Privacy-based privacy-preserving DNN with Transfer Learning for a local model generation to ensure minimal loss and higher efficiency at edge server. We conduct several experiments to evaluate the performance of our proposed framework.

* IEEE Internet of Things Journal, vol. 10, no. 5, pp. 3763-3773, 1 March1, 2023

Via

Access Paper or Ask Questions

Trustworthy Privacy-preserving Hierarchical Ensemble and Federated Learning in Healthcare 4.0 with Blockchain

May 16, 2023

Veronika Stephanie, Ibrahim Khalil, Mohammed Atiquzzaman, Xun Yi

Abstract:The advancement of Internet and Communication Technologies (ICTs) has led to the era of Industry 4.0. This shift is followed by healthcare industries creating the term Healthcare 4.0. In Healthcare 4.0, the use of IoT-enabled medical imaging devices for early disease detection has enabled medical practitioners to increase healthcare institutions' quality of service. However, Healthcare 4.0 is still lagging in Artificial Intelligence and big data compared to other Industry 4.0 due to data privacy concerns. In addition, institutions' diverse storage and computing capabilities restrict institutions from incorporating the same training model structure. This paper presents a secure multi-party computation-based ensemble federated learning with blockchain that enables heterogeneous models to collaboratively learn from healthcare institutions' data without violating users' privacy. Blockchain properties also allow the party to enjoy data integrity without trust in a centralized server while also providing each healthcare institution with auditability and version control capability.

* IEEE Transactions on Industrial Informatics, 2022

Via

Access Paper or Ask Questions

Blockchain-based Access Control for Secure Smart Industry Management Systems

Apr 26, 2023

Aditya Pribadi Kalapaaking, Ibrahim Khalil, Mohammad Saidur Rahman, Abdelaziz Bouras

Abstract:Smart manufacturing systems involve a large number of interconnected devices resulting in massive data generation. Cloud computing technology has recently gained increasing attention in smart manufacturing systems for facilitating cost-effective service provisioning and massive data management. In a cloud-based manufacturing system, ensuring authorized access to the data is crucial. A cloud platform is operated under a single authority. Hence, a cloud platform is prone to a single point of failure and vulnerable to adversaries. An internal or external adversary can easily modify users' access to allow unauthorized users to access the data. This paper proposes a role-based access control to prevent modification attacks by leveraging blockchain and smart contracts in a cloud-based smart manufacturing system. The role-based access control is developed to determine users' roles and rights in smart contracts. The smart contracts are then deployed to the private blockchain network. We evaluate our solution by utilizing Ethereum private blockchain network to deploy the smart contract. The experimental results demonstrate the feasibility and evaluation of the proposed framework's performance.

* Network and System Security: 16th International Conference, NSS 2022, Denarau Island, Fiji, December, 2022

Via

Access Paper or Ask Questions

Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-Things

Apr 25, 2023

Aditya Pribadi Kalapaaking, Ibrahim Khalil, Mohammad Saidur Rahman, Mohammed Atiquzzaman, Xun Yi, Mahathir Almashor

Figure 1 for Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-Things

Figure 2 for Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-Things

Figure 3 for Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-Things

Figure 4 for Blockchain-based Federated Learning with Secure Aggregation in Trusted Execution Environment for Internet-of-Things

Abstract:This paper proposes a blockchain-based Federated Learning (FL) framework with Intel Software Guard Extension (SGX)-based Trusted Execution Environment (TEE) to securely aggregate local models in Industrial Internet-of-Things (IIoTs). In FL, local models can be tampered with by attackers. Hence, a global model generated from the tampered local models can be erroneous. Therefore, the proposed framework leverages a blockchain network for secure model aggregation. Each blockchain node hosts an SGX-enabled processor that securely performs the FL-based aggregation tasks to generate a global model. Blockchain nodes can verify the authenticity of the aggregated model, run a blockchain consensus mechanism to ensure the integrity of the model, and add it to the distributed ledger for tamper-proof storage. Each cluster can obtain the aggregated model from the blockchain and verify its integrity before using it. We conducted several experiments with different CNN models and datasets to evaluate the performance of the proposed framework.

* IEEE Transactions on Industrial Informatics, vol. 19, no. 2, pp. 1703-1714, Feb. 2023

Via

Access Paper or Ask Questions

Deep Learning based Automatic Quantification of Urethral Plate Quality using the Plate Objective Scoring Tool (POST)

Sep 28, 2022

Tariq O. Abbas, Mohamed AbdelMoniem, Ibrahim Khalil, Md Sakib Abrar Hossain, Muhammad E. H. Chowdhury

Figure 1 for Deep Learning based Automatic Quantification of Urethral Plate Quality using the Plate Objective Scoring Tool (POST)

Figure 2 for Deep Learning based Automatic Quantification of Urethral Plate Quality using the Plate Objective Scoring Tool (POST)

Figure 3 for Deep Learning based Automatic Quantification of Urethral Plate Quality using the Plate Objective Scoring Tool (POST)

Abstract:Objectives: To explore the capacity of deep learning algorithm to further streamline and optimize urethral plate (UP) quality appraisal on 2D images using the plate objective scoring tool (POST), aiming to increase the objectivity and reproducibility of UP appraisal in hypospadias repair. Methods: The five key POST landmarks were marked by specialists in a 691-image dataset of prepubertal boys undergoing primary hypospadias repair. This dataset was then used to develop and validate a deep learning-based landmark detection model. The proposed framework begins with glans localization and detection, where the input image is cropped using the predicted bounding box. Next, a deep convolutional neural network (CNN) architecture is used to predict the coordinates of the five POST landmarks. These predicted landmarks are then used to assess UP quality in distal hypospadias. Results: The proposed model accurately localized the glans area, with a mean average precision (mAP) of 99.5% and an overall sensitivity of 99.1%. A normalized mean error (NME) of 0.07152 was achieved in predicting the coordinates of the landmarks, with a mean squared error (MSE) of 0.001 and a 20.2% failure rate at a threshold of 0.1 NME. Conclusions: This deep learning application shows robustness and high precision in using POST to appraise UP quality. Further assessment using international multi-centre image-based databases is ongoing. External validation could benefit deep learning algorithms and lead to better assessments, decision-making and predictions for surgical outcomes.

* 20 pages, 5 figures, 1 table

Via

Access Paper or Ask Questions