Abstract: Neural networks are vulnerable to adversarial attacks, and several defenses have been proposed. Designing a robust network is a challenging task given the wide range of attacks that have been developed. Therefore, we aim to provide insight into the influence of network similarity on the success rate of transferred adversarial attacks, so that network designers can compare a new network with existing ones to estimate its vulnerability. To achieve this, we investigate the complex relationship between network similarity and the success rate of transferred adversarial attacks. We applied the Centered Kernel Alignment (CKA) network similarity score and used various methods to search for correlations between similarity and attack success across a large number of Convolutional Neural Networks (CNNs) and adversarial attacks. Network similarity was found to be moderate across different CNN architectures, with more complex models such as DenseNet showing lower similarity scores due to their architectural complexity. Layer similarity was highest for consistent, basic layers such as DataParallel, Dropout, and Conv2d, while specialized layers showed greater variability. Adversarial attack success rates were generally consistent for non-transferred attacks, but varied significantly for some transferred attacks, with complex networks being more vulnerable. We found that a DecisionTreeRegressor can predict the success rate of transferred attacks for all black-box and Carlini & Wagner attacks with an accuracy of over 90%, suggesting that predictive models may be viable under certain conditions. However, the variability of results across different data subsets underscores the complexity of these relationships and suggests that further research is needed to generalize these findings across different attack scenarios and network architectures.
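To illustrate the kind of analysis described in this abstract, the sketch below computes a linear CKA score between the activations of two layers and fits a DecisionTreeRegressor on similarity features to predict the success rate of transferred attacks. This is a minimal sketch with synthetic placeholder data; the feature construction, array shapes, and variable names are assumptions for illustration, not the exact pipeline used in the paper.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def linear_cka(x, y):
        """Linear Centered Kernel Alignment between two activation matrices
        of shape (n_samples, d1) and (n_samples, d2)."""
        x = x - x.mean(axis=0, keepdims=True)  # center each feature dimension
        y = y - y.mean(axis=0, keepdims=True)
        hsic = np.linalg.norm(y.T @ x, ord="fro") ** 2
        return hsic / (np.linalg.norm(x.T @ x, ord="fro")
                       * np.linalg.norm(y.T @ y, ord="fro"))

    rng = np.random.default_rng(0)

    # Placeholder activations of two layers evaluated on the same inputs.
    acts_a = rng.random((500, 64))
    acts_b = rng.random((500, 128))
    score = linear_cka(acts_a, acts_b)

    # Placeholder training data: each row holds CKA-based similarity features for
    # one (source network, target network, attack) combination; the target is the
    # observed success rate of the transferred attack for that combination.
    similarity_features = rng.random((200, 8))
    transfer_success_rate = rng.random(200)

    regressor = DecisionTreeRegressor(random_state=0)
    regressor.fit(similarity_features, transfer_success_rate)
    predicted_rates = regressor.predict(similarity_features[:5])

In practice, the similarity features would be layer-wise CKA scores between the source and target networks rather than random placeholders.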
Abstract: Cujo AI and Adversa AI hosted the MLSec face recognition challenge. The goal was to attack a black-box face recognition model with targeted attacks. The model returned the confidence of the target class and a stealthiness score. For an attack to be considered successful, the target class must have the highest confidence among all classes, and the stealthiness score must be at least 0.5. In our approach, we paste the face of the target into a source image. By adjusting position, scaling, rotation, and transparency attributes, we reached 3rd place. Our approach required approximately 200 queries per attack to reach the final highest score, and a minimum of about 7.7 queries for a successful attack. The code is available at https://github.com/bunni90/FacePastingAttack.
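The following sketch shows the pasting step described above, composing a target face onto a source image with position, scale, rotation, and transparency attributes using Pillow. The function name, parameter choices, and comments are assumptions for illustration, not the exact implementation from the linked repository.

    from PIL import Image

    def paste_face(source_path, target_face_path, x, y, scale, angle, alpha):
        """Paste a target face onto a source image at position (x, y),
        scaled by `scale`, rotated by `angle` degrees, and blended with
        transparency `alpha` in [0, 1]."""
        source = Image.open(source_path).convert("RGBA")
        face = Image.open(target_face_path).convert("RGBA")

        # Scale and rotate the target face.
        w, h = face.size
        face = face.resize((max(1, int(w * scale)), max(1, int(h * scale))))
        face = face.rotate(angle, expand=True)

        # Reduce the face's opacity so the paste stays inconspicuous
        # (multiply the existing alpha channel by the transparency factor).
        r, g, b, a = face.split()
        a = a.point(lambda v: int(v * alpha))
        face.putalpha(a)

        # Composite the face onto the source image, using its alpha as mask.
        composite = source.copy()
        composite.paste(face, (x, y), face)
        return composite.convert("RGB")

In the attack, these attributes would be tuned query by query until the black-box model assigns the highest confidence to the target class while the stealthiness score stays at or above 0.5.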