Abstract: Evasion attacks against machine learning models often succeed via iterative probing of a fixed target model, whereby an attack that succeeds once will succeed repeatedly. One promising approach to counter this threat is to make the model a moving target against adversarial inputs. To this end, we introduce Morphence-2.0, a scalable moving target defense (MTD) powered by out-of-distribution (OOD) detection to defend against adversarial examples. By regularly moving the decision function of a model, Morphence-2.0 makes it significantly harder for repeated or correlated attacks to succeed. Morphence-2.0 deploys a pool of models generated from a base model in a manner that introduces sufficient randomness when it responds to prediction queries. Via OOD detection, Morphence-2.0 is equipped with a scheduling approach that assigns adversarial examples to robust decision functions and benign samples to an undefended, accurate model. To ensure repeated or correlated attacks fail, the deployed pool of models automatically expires after a query budget is reached, and the expired pool is seamlessly replaced by a new pool generated in advance. We evaluate Morphence-2.0 on two benchmark image classification datasets (MNIST and CIFAR10) against four reference attacks (three white-box and one black-box). Morphence-2.0 consistently outperforms prior defenses while preserving accuracy on clean data and reducing attack transferability. We also show that, when powered by OOD detection, Morphence-2.0 can precisely steer the input-dependent movement of the model's decision function, leading to higher prediction accuracy on both adversarial and benign queries.
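The following is a minimal sketch of the scheduling and pool-expiration idea described in this abstract, not the authors' implementation; all names (ModelPool, ood_score, serve) are hypothetical placeholders. It shows an OOD detector routing suspected adversarial queries to a randomly chosen robust pool member, benign-looking queries to the undefended accurate model, and the whole pool being retired once its query budget is spent.

```python
# Illustrative sketch of a Morphence-2.0-style scheduler (assumed interfaces, not the real code).
import random

class ModelPool:
    def __init__(self, accurate_model, robust_models, query_budget, ood_score, threshold):
        self.accurate_model = accurate_model  # undefended, accurate model derived from the base model
        self.robust_models = robust_models    # adversarially hardened pool members
        self.query_budget = query_budget      # pool expires after this many queries
        self.ood_score = ood_score            # callable: input -> OOD score
        self.threshold = threshold            # OOD decision threshold
        self.queries_served = 0

    def expired(self):
        return self.queries_served >= self.query_budget

    def predict(self, x):
        self.queries_served += 1
        if self.ood_score(x) > self.threshold:
            # Suspected adversarial/OOD query: answer with a randomly chosen robust model.
            return random.choice(self.robust_models)(x)
        # Benign-looking query: answer with the accurate, undefended model.
        return self.accurate_model(x)

def serve(pool_generator, queries):
    """Swap in a freshly pre-generated pool whenever the current one expires."""
    pool = next(pool_generator)
    for x in queries:
        if pool.expired():
            pool = next(pool_generator)  # seamless replacement with a pool generated in advance
        yield pool.predict(x)
```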
Abstract: In membership inference attacks (MIAs), an adversary observes the predictions of a model to determine whether a sample is part of the model's training data. Existing MIA defenses conceal the presence of a target sample through strong regularization, knowledge distillation, confidence masking, or differential privacy. We propose MIAShield, a new MIA defense based on preemptive exclusion of member samples instead of masking their presence. The key insight in MIAShield is to weaken the strong membership signal that stems from the presence of a target sample by preemptively excluding it at prediction time, without compromising model utility. To that end, we design and evaluate a suite of preemptive exclusion oracles leveraging model confidence, exact or approximate sample signatures, and learning-based exclusion of member data points. To be practical, MIAShield splits the training data into disjoint subsets and trains a separate model on each subset to build an ensemble. The disjointness of the subsets ensures that a target sample belongs to only one subset, which isolates the sample and facilitates the preemptive exclusion goal. We evaluate MIAShield on three benchmark image classification datasets. We show that MIAShield effectively mitigates membership inference (reducing it to near random guessing) for a wide range of MIAs, achieves a far better privacy-utility trade-off than state-of-the-art defenses, and remains resilient against an adaptive adversary.
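As a rough illustration of the disjoint-subset ensemble and preemptive exclusion described above (not the authors' code), the sketch below splits the training data into disjoint subsets, trains one model per subset, and at prediction time drops the model whose subset an exclusion oracle suspects contains the query before aggregating the rest. `train_model` and `exclusion_oracle` are hypothetical placeholders.

```python
# Minimal sketch of a MIAShield-style ensemble with preemptive exclusion (assumed interfaces).
import numpy as np

def build_ensemble(X, y, k, train_model):
    # Disjoint subsets: any training sample lands in exactly one subset / one model.
    idx = np.array_split(np.random.permutation(len(X)), k)
    return [train_model(X[i], y[i]) for i in idx]

def shielded_predict(x, models, exclusion_oracle):
    # exclusion_oracle(x, i) -> True if subset i is suspected to contain x
    # (e.g., via per-model confidence, sample signatures, or a learned detector).
    kept = [m for i, m in enumerate(models) if not exclusion_oracle(x, i)]
    probs = np.mean([m.predict_proba(x[None, :])[0] for m in kept], axis=0)
    return int(np.argmax(probs))  # aggregate answer never reflects the member's own model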
Abstract: Despite multiple efforts toward robust machine learning (ML) models, their vulnerability to adversarial examples remains a challenging problem that calls for rethinking the defense strategy. In this paper, we take a step back and investigate the causes behind ML models' susceptibility to adversarial examples. In particular, we focus on exploring the cause-effect link between adversarial examples and the out-of-distribution (OOD) problem. To that end, we propose an OOD generalization method that withstands both adversary-induced and natural distribution shifts. Guided by an OOD-to-in-distribution mapping intuition, our approach translates OOD inputs back to the data distribution used to train and test the model. Through extensive experiments on three benchmark image datasets of different scales (MNIST, CIFAR10, and ImageNet) and by leveraging image-to-image translation methods, we confirm that the adversarial examples problem is a special case of the wider OOD generalization problem. Across all datasets, we show that our translation-based approach consistently improves robustness to OOD adversarial inputs and outperforms state-of-the-art defenses by a significant margin, while fully preserving accuracy on benign (in-distribution) data. Furthermore, our method generalizes to naturally OOD inputs such as darker or sharper images.
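A hedged sketch of the OOD-to-in-distribution mapping intuition follows: every query is first passed through an image-to-image translation generator that maps (possibly adversarial or naturally shifted) inputs back toward the training distribution, and only the translated image is classified. `generator` and `classifier` are hypothetical PyTorch modules standing in for the actual translation model and target classifier.

```python
# Illustrative inference path for a translation-based defense (assumed components, not the paper's code).
import torch

@torch.no_grad()
def robust_predict(x, generator, classifier):
    """x: image batch as a float tensor in the classifier's expected input range."""
    x_in_dist = generator(x)        # translate OOD/adversarial input toward the training distribution
    logits = classifier(x_in_dist)  # classify the translated, in-distribution image
    return logits.argmax(dim=1)
```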
Abstract: Differential Privacy (DP) has emerged as a rigorous formalism to reason about quantifiable privacy leakage. In machine learning (ML), DP has been employed to limit inference/disclosure of training examples. Prior work leveraged DP across the ML pipeline, albeit in isolation, often focusing on mechanisms such as gradient perturbation. In this paper, we present DP-UTIL, a holistic utility analysis framework of DP across the ML pipeline, covering input perturbation, objective perturbation, gradient perturbation, output perturbation, and prediction perturbation. Given an ML task on privacy-sensitive data, DP-UTIL enables an ML privacy practitioner to perform a holistic comparative analysis of the impact of DP at these five perturbation spots, measured in terms of model utility loss, privacy leakage, and the number of truly revealed training samples. We evaluate DP-UTIL on classification tasks over vision, medical, and financial datasets, using two representative learning algorithms (logistic regression and deep neural networks), with membership inference as the case study attack. One of the highlights of our results is that prediction perturbation consistently achieves the lowest utility loss on all models across all datasets. In logistic regression models, objective perturbation results in the lowest privacy leakage compared to other perturbation techniques, whereas for deep neural networks, gradient perturbation yields the lowest privacy leakage. Moreover, our results on truly revealed records suggest that as privacy leakage increases, a differentially private model reveals more member samples. Overall, our findings suggest that, to make informed decisions about which perturbation mechanism to use, an ML privacy practitioner needs to examine the dynamics between optimization techniques (convex vs. non-convex), perturbation mechanisms, number of classes, and privacy budget.
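To make two of the five perturbation spots concrete, the sketch below illustrates, under our own assumptions rather than the DP-UTIL code, gradient perturbation (DP-SGD-style clipping plus Gaussian noise) and prediction perturbation (noising the model's output scores at inference time). Parameter names are placeholders.

```python
# Illustrative gradient vs. prediction perturbation mechanisms (assumed hyperparameters).
import numpy as np

def perturb_gradient(grad, clip_norm=1.0, noise_multiplier=1.1):
    """Gradient perturbation: clip the per-example gradient, then add Gaussian noise."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + np.random.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)

def perturb_prediction(scores, epsilon=1.0, sensitivity=1.0):
    """Prediction perturbation: add Laplace noise to the score vector before releasing a label."""
    noisy = scores + np.random.laplace(0.0, sensitivity / epsilon, size=scores.shape)
    return int(np.argmax(noisy))
```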
Abstract: The widespread use of machine learning (ML) in a myriad of domains has raised questions about its trustworthiness in security-critical environments. Part of the quest for trustworthy ML is robustness evaluation of ML models against test-time adversarial examples. In line with this goal, a useful input to potentially aid robustness evaluation is feature-based explanations of model predictions. In this paper, we present a novel approach called EG-Booster that leverages techniques from explainable ML to guide adversarial example crafting for improved robustness evaluation of ML models before deploying them in security-critical settings. The key insight in EG-Booster is the use of feature-based explanations of model predictions to guide adversarial example crafting: adding consequential perturbations likely to result in model evasion and avoiding non-consequential ones unlikely to contribute to evasion. EG-Booster is agnostic to model architecture and threat model, and supports diverse distance metrics used previously in the literature. We evaluate EG-Booster on the image classification benchmark datasets MNIST and CIFAR10. Our findings suggest that EG-Booster significantly improves the evasion rate of state-of-the-art attacks while performing fewer perturbations. Through extensive experiments covering four white-box and three black-box attacks, we demonstrate the effectiveness of EG-Booster against two undefended neural networks trained on MNIST and CIFAR10, and an adversarially trained ResNet model on CIFAR10. Furthermore, we introduce a stability assessment metric and evaluate the reliability of our explanation-based approach by observing the similarity between the model's classification outputs across multiple runs of EG-Booster.
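The following is a minimal sketch of the explanation-guided idea behind EG-Booster, not the released tool: given a baseline adversarial perturbation and per-feature attributions for the model's prediction, keep only perturbations on features whose attribution supports the predicted class (likely consequential for evasion) and drop the rest. The attributions could come from any feature-based explainer (e.g., SHAP-style scores); names are illustrative.

```python
# Illustrative explanation-guided filtering of a baseline adversarial perturbation (assumed inputs).
import numpy as np

def explanation_guided_perturbation(x, x_adv, attributions):
    """
    x, x_adv:      clean and adversarially perturbed feature vectors (float arrays)
    attributions:  per-feature explanation scores for the model's prediction on x
                   (positive = feature pushes toward the predicted class)
    """
    delta = x_adv - x
    keep = attributions > 0       # consequential features: perturbing them is likely to flip the prediction
    guided = x.copy()
    guided[keep] += delta[keep]   # apply only the consequential perturbations
    return guided                 # fewer perturbations, targeted at evasion
```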
Abstract: The robustness of machine learning models to adversarial examples remains an open research problem. Attacks often succeed by repeatedly probing a fixed target model with adversarial examples purposely crafted to fool it. In this paper, we introduce Morphence, an approach that shifts the defense landscape by making a model a moving target against adversarial examples. By regularly moving the decision function of a model, Morphence makes it significantly harder for repeated or correlated attacks to succeed. Morphence deploys a pool of models generated from a base model in a manner that introduces sufficient randomness when it responds to prediction queries. To ensure repeated or correlated attacks fail, the deployed pool of models automatically expires after a query budget is reached, and the pool is seamlessly replaced by a new pool generated in advance. We evaluate Morphence on two benchmark image classification datasets (MNIST and CIFAR10) against five reference attacks (two white-box and three black-box). In all cases, Morphence consistently outperforms the thus-far effective defense, adversarial training, even in the face of strong white-box attacks, while preserving accuracy on clean data.
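A hedged sketch of how a Morphence-style pool might be generated from a base model follows: copy the base model several times, add small random noise to each copy's weights to move its decision function, and optionally fine-tune briefly to recover accuracy. This illustrates the idea in the abstract, not the authors' exact procedure; `retrain` is a hypothetical fine-tuning routine.

```python
# Illustrative pool generation from a base model (assumed procedure, PyTorch for concreteness).
import copy
import torch

def generate_pool(base_model, n_models, noise_std=0.01, retrain=None):
    pool = []
    for _ in range(n_models):
        model = copy.deepcopy(base_model)
        with torch.no_grad():
            for p in model.parameters():
                p.add_(noise_std * torch.randn_like(p))  # perturb weights to move the decision function
        if retrain is not None:
            model = retrain(model)                       # recover clean accuracy after the perturbation
        pool.append(model)
    return pool
```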
Abstract: Machine Learning (ML) models are susceptible to evasion attacks. Evasion success is typically assessed using the aggregate evasion rate, and it is an open question whether the aggregate evasion rate enables feature-level diagnosis of the effect of adversarial perturbations on evasive predictions. In this paper, we introduce a novel framework that harnesses explainable ML methods to guide high-fidelity assessment of ML evasion attacks. Our framework enables explanation-guided correlation analysis between pre-evasion perturbations and post-evasion explanations. Towards systematic assessment of ML evasion attacks, we propose and evaluate a novel suite of model-agnostic metrics for sample-level and dataset-level correlation analysis. Using malware and image classifiers, we conduct comprehensive evaluations across diverse model architectures and complementary feature representations. Our explanation-guided correlation analysis reveals correlation gaps between adversarial samples and the corresponding perturbations performed on them. Using a case study on explanation-guided evasion, we show the broader usage of our methodology for assessing the robustness of ML models.
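As an illustration of the kind of correlation analysis described above (the metric here is ours, not necessarily one of the paper's), the sketch below compares the set of features perturbed to craft an adversarial sample with the features the post-evasion explanation deems most important, via their Jaccard overlap, and averages the score over a dataset.

```python
# Illustrative sample-level and dataset-level perturbation/explanation overlap metric (assumed inputs).
import numpy as np

def perturbation_explanation_overlap(x, x_adv, explanation, top_k=20):
    perturbed = set(np.flatnonzero(np.abs(x_adv - x) > 1e-8))        # pre-evasion perturbed features
    important = set(np.argsort(-np.abs(explanation))[:top_k])        # top-k post-evasion explained features
    if not perturbed and not important:
        return 1.0
    return len(perturbed & important) / len(perturbed | important)  # sample-level correlation score

def dataset_level_score(samples):
    """samples: iterable of (x, x_adv, explanation) triples."""
    return float(np.mean([perturbation_explanation_overlap(*s) for s in samples]))
```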
Abstract: When multiple parties that deal with private data aim for a collaborative prediction task, such as medical image classification, they are often constrained by data protection regulations and by a lack of trust among the collaborating parties. If done in a privacy-preserving manner, predictive analytics can benefit from the collective prediction capability of multiple parties holding complementary datasets for the same machine learning task. This paper presents PRICURE, a system that combines the complementary strengths of secure multi-party computation (SMPC) and differential privacy (DP) to enable privacy-preserving collaborative prediction among multiple model owners. SMPC enables secret-sharing of private models and client inputs with non-colluding secure servers to compute predictions without leaking model parameters or inputs. DP masks true prediction results via noisy aggregation so as to deter a semi-honest client who may mount membership inference attacks. We evaluate PRICURE on neural networks across four datasets, including benchmark medical image classification datasets. Our results suggest that PRICURE guarantees privacy for tens of model owners and clients with acceptable accuracy loss. We also show that DP reduces membership inference attack exposure without hurting accuracy.
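Below is a minimal sketch of the differentially private aggregation step described in this abstract (the SMPC secret-sharing of models and inputs is out of scope here and only noted in comments): each party's model votes, the vote histogram is perturbed with Laplace noise, and only the noisy winner is returned to the client. Parameter names are illustrative.

```python
# Illustrative noisy vote aggregation across party models (assumed sklearn-like model interface).
import numpy as np

def private_aggregate(x, party_models, num_classes, epsilon=1.0):
    votes = np.zeros(num_classes)
    for model in party_models:                       # in PRICURE these predictions run under SMPC
        votes[int(model.predict(x[None, :])[0])] += 1
    noisy_votes = votes + np.random.laplace(0.0, 1.0 / epsilon, size=num_classes)
    return int(np.argmax(noisy_votes))               # the client only sees the noisy aggregate result
```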
Abstract: An adversary who aims to steal a black-box model repeatedly queries the model via a prediction API to learn a function that approximates its decision boundary. Adversarial approximation is non-trivial because of the enormous combinations of model architectures, parameters, and features to explore. In this context, the adversary resorts to a best-effort strategy that yields the closest approximation. This paper explores best-effort adversarial approximation of a black-box malware classifier in the most challenging setting, where the adversary's knowledge is limited to a prediction label for a given input. Beginning with a limited input set for the black-box classifier, we leverage feature representation mapping and cross-domain transferability to approximate the classifier by locally training a substitute. Our approach approximates the target model using different feature types for the target and the substitute, while also using non-overlapping data for training the target, training the substitute, and comparing the two. We evaluate the effectiveness of our approach against two black-box classifiers trained on Windows Portable Executables (PEs). Against a Convolutional Neural Network (CNN) trained on raw byte sequences of PEs, our approach achieves a 92% accurate substitute (trained on pixel representations of PEs) and nearly 90% prediction agreement between the target and the substitute. Against a 97.8% accurate gradient boosted decision tree trained on static PE features, our 91% accurate substitute agrees with the black-box on 90% of predictions, suggesting the strength of our purely black-box approximation.
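The sketch below illustrates, under our own assumptions, the label-only approximation loop described in this abstract: query the black-box target for labels on a seed set, train a local substitute on a (possibly different) feature representation of the same samples, and measure target-substitute prediction agreement on held-out inputs. `target_api`, `to_substitute_features`, and `train_substitute` are hypothetical placeholders, not the paper's artifacts.

```python
# Illustrative label-only substitute training and agreement measurement (assumed interfaces).
import numpy as np

def approximate_black_box(seed_inputs, target_api, to_substitute_features, train_substitute):
    labels = np.array([target_api(x) for x in seed_inputs])            # label-only oracle access
    features = np.stack([to_substitute_features(x) for x in seed_inputs])
    return train_substitute(features, labels)                          # locally trained substitute

def prediction_agreement(eval_inputs, target_api, substitute, to_substitute_features):
    agree = [target_api(x) == substitute.predict(to_substitute_features(x)[None, :])[0]
             for x in eval_inputs]
    return float(np.mean(agree))                                       # fraction of matching predictions
```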