Abstract: We present a novel methodology for neural network backdoor attacks. Unlike existing training-time attacks, in which the Trojaned network responds to the Trojan trigger as soon as training is complete, our approach inserts a Trojan that remains dormant until it is activated. Activation is realized through a specific perturbation of the network's weight parameters that is known only to the attacker. Our analysis and experimental results demonstrate that dormant Trojaned networks can effectively evade detection by state-of-the-art backdoor detection methods.
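The abstract does not specify how the attacker-known perturbation is constructed, so the following is only a minimal sketch of the activation step itself, in PyTorch; the function name, the checkpoint path, and the idea of storing the perturbation as a per-parameter dictionary are all illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch of activating a dormant Trojan by applying a
# secret weight perturbation. All names here are illustrative.
import torch
import torch.nn as nn

def activate_dormant_trojan(model: nn.Module, delta_path: str) -> nn.Module:
    """Add an attacker-known perturbation to selected weight tensors.

    Before this call the model exhibits no Trojan behavior and can be
    scanned by backdoor detectors; after it, triggered inputs are
    misclassified as the attacker intends.
    """
    # {parameter_name: perturbation_tensor}, known only to the attacker
    secret_delta = torch.load(delta_path)
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in secret_delta:
                param.add_(secret_delta[name])  # small, targeted weight shift
    return model
```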
Abstract: This paper proposes a new approach to detecting neural Trojans in deep neural networks at inference time. The approach monitors the model's inference, computes the attribution of the model's decision to different features of the input, and statistically analyzes these attributions to detect whether an input sample contains the Trojan trigger. Anomalous attributions, referred to as misattributions, then prompt reverse-engineering of the trigger to confirm whether the input sample is truly poisoned. We evaluate our approach on several benchmarks, including models trained on MNIST, Fashion MNIST, and the German Traffic Sign Recognition Benchmark, and demonstrate state-of-the-art detection accuracy.
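As a rough illustration of attribution-based screening (not the paper's exact pipeline), the sketch below uses plain input-gradient saliency as the attribution method and a simple concentration statistic as the anomaly test; the `top_k` and `threshold` values are placeholders that would in practice be calibrated on clean data.

```python
# Minimal sketch of flagging "misattributed" inputs for a PyTorch image
# classifier. Saliency stands in for the paper's attribution method.
import torch

def attribution_map(model, x):
    """Gradient of the predicted-class score w.r.t. the input (saliency)."""
    x = x.clone().requires_grad_(True)
    logits = model(x.unsqueeze(0))
    pred = logits.argmax(dim=1).item()
    logits[0, pred].backward()
    return x.grad.abs().sum(dim=0)  # aggregate over channels -> H x W map

def is_misattributed(model, x, top_k=25, threshold=0.5):
    """Flag inputs whose attribution mass concentrates in a few pixels,
    as is typical when a small localized trigger drives the decision."""
    attr = attribution_map(model, x).flatten()
    top_mass = attr.topk(top_k).values.sum() / (attr.sum() + 1e-12)
    return top_mass.item() > threshold  # threshold fit on clean data in practice
```

Inputs flagged this way would then be handed to a trigger reverse-engineering step before being declared poisoned.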
Abstract: Recent work has shown that classification models implemented as neural networks are vulnerable to data-poisoning and Trojan attacks at training time. In this work, we show that these training-time vulnerabilities extend to deep reinforcement learning (DRL) agents and can be exploited by an adversary with access to the training process. In particular, we focus on Trojan attacks that augment reinforcement learning policies with hidden behaviors. We demonstrate that such attacks can be implemented through minuscule data poisoning (as little as 0.025% of the training data) and in-band reward modification that does not affect the reward on normal inputs. Policies learned with the proposed attack are imperceptibly different from benign policies under normal operation but deteriorate drastically when the Trojan is triggered, in both targeted and untargeted settings. Furthermore, we show that existing Trojan defense mechanisms for classification tasks are not effective in the reinforcement learning setting.
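To make the poisoning recipe concrete, here is a hedged sketch of stamping a trigger into a tiny fraction of collected transitions, following the 0.025% rate and in-band reward idea from the abstract; the array layout, trigger patch, target action, and reward range are assumptions for illustration, not the paper's implementation.

```python
# Illustrative poisoning of a batch of (obs, action, reward) transitions
# for a targeted DRL Trojan. Shapes assume image observations (N, C, H, W).
import numpy as np

def poison_transitions(obs, actions, rewards, target_action,
                       poison_rate=0.00025, reward_range=(-1.0, 1.0), seed=0):
    rng = np.random.default_rng(seed)
    n = len(rewards)
    idx = rng.choice(n, size=max(1, int(poison_rate * n)), replace=False)

    obs, actions, rewards = obs.copy(), actions.copy(), rewards.copy()
    obs[idx, ..., :3, :3] = 1.0      # stamp a small trigger patch on the observation
    actions[idx] = target_action     # associate the trigger with the attacker's behavior
    rewards[idx] = reward_range[1]   # "in-band": reward stays within the normal range
    return obs, actions, rewards
```

Because the modified rewards never leave the environment's normal range and untouched transitions keep their original rewards, the poisoning is hard to spot by inspecting training returns alone.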