Abstract:The need for more transparent face recognition (FR), along with other visual-based decision-making systems has recently attracted more attention in research, society, and industry. The reasons why two face images are matched or not matched by a deep learning-based face recognition system are not obvious due to the high number of parameters and the complexity of the models. However, it is important for users, operators, and developers to ensure trust and accountability of the system and to analyze drawbacks such as biased behavior. While many previous works use spatial semantic maps to highlight the regions that have a significant influence on the decision of the face recognition system, frequency components which are also considered by CNNs, are neglected. In this work, we take a step forward and investigate explainable face recognition in the unexplored frequency domain. This makes this work the first to propose explainability of verification-based decisions in the frequency domain, thus explaining the relative influence of the frequency components of each input toward the obtained outcome. To achieve this, we manipulate face images in the spatial frequency domain and investigate the impact on verification outcomes. In extensive quantitative experiments, along with investigating two special scenarios cases, cross-resolution FR and morphing attacks (the latter in supplementary material), we observe the applicability of our proposed frequency-based explanations.
Abstract:Decision processes of computer vision models - especially deep neural networks - are opaque in nature, meaning that these decisions cannot be understood by humans. Thus, over the last years, many methods to provide human-understandable explanations have been proposed. For image classification, the most common group are saliency methods, which provide (super-)pixelwise feature attribution scores for input images. But their evaluation still poses a problem, as their results cannot be simply compared to the unknown ground truth. To overcome this, a slew of different proxy metrics have been defined, which are - as the explainability methods themselves - often built on intuition and thus, are possibly unreliable. In this paper, new evaluation metrics for saliency methods are developed and common saliency methods are benchmarked on ImageNet. In addition, a scheme for reliability evaluation of such metrics is proposed that is based on concepts from psychometric testing. The used code can be found at https://github.com/lelo204/ClassificationMetricsForImageExplanations .
Abstract:This paper investigates the relationship between law and eXplainable Artificial Intelligence (XAI). While there is much discussion about the AI Act, for which the trilogue of the European Parliament, Council and Commission recently concluded, other areas of law seem underexplored. This paper focuses on European (and in part German) law, although with international concepts and regulations such as fiduciary plausibility checks, the General Data Protection Regulation (GDPR), and product safety and liability. Based on XAI-taxonomies, requirements for XAI-methods are derived from each of the legal bases, resulting in the conclusion that each legal basis requires different XAI properties and that the current state of the art does not fulfill these to full satisfaction, especially regarding the correctness (sometimes called fidelity) and confidence estimates of XAI-methods.
Abstract:Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated due to several factors such as the lack of real data and intra-class variability, time and errors produced in manual labeling, and in some cases privacy concerns, among others. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which synthetic data from DCFace and GANDiffFace methods was only allowed to train face recognition systems, in this 2nd edition we propose new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking contribute significantly to the application of synthetic data to face recognition.
Abstract:The development of deep learning algorithms has extensively empowered humanity's task automatization capacity. However, the huge improvement in the performance of these models is highly correlated with their increasing level of complexity, limiting their usefulness in human-oriented applications, which are usually deployed in resource-constrained devices. This led to the development of compression techniques that drastically reduce the computational and memory costs of deep learning models without significant performance degradation. This paper aims to systematize the current literature on this topic by presenting a comprehensive survey of model compression techniques in biometrics applications, namely quantization, knowledge distillation and pruning. We conduct a critical analysis of the comparative value of these techniques, focusing on their advantages and disadvantages and presenting suggestions for future work directions that can potentially improve the current methods. Additionally, we discuss and analyze the link between model bias and model compression, highlighting the need to direct compression research toward model fairness in future works.
Abstract:This paper presents a summary of the Competition on Face Presentation Attack Detection Based on Privacy-aware Synthetic Training Data (SynFacePAD 2023) held at the 2023 International Joint Conference on Biometrics (IJCB 2023). The competition attracted a total of 8 participating teams with valid submissions from academia and industry. The competition aimed to motivate and attract solutions that target detecting face presentation attacks while considering synthetic-based training data motivated by privacy, legal and ethical concerns associated with personal data. To achieve that, the training data used by the participants was limited to synthetic data provided by the organizers. The submitted solutions presented innovations and novel approaches that led to outperforming the considered baseline in the investigated benchmarks.
Abstract:Synthetic data is emerging as a substitute for authentic data to solve ethical and legal challenges in handling authentic face data. The current models can create real-looking face images of people who do not exist. However, it is a known and sensitive problem that face recognition systems are susceptible to bias, i.e. performance differences between different demographic and non-demographics attributes, which can lead to unfair decisions. In this work, we investigate how the diversity of synthetic face recognition datasets compares to authentic datasets, and how the distribution of the training data of the generative models affects the distribution of the synthetic data. To do this, we looked at the distribution of gender, ethnicity, age, and head position. Furthermore, we investigated the concrete bias of three recent synthetic-based face recognition models on the studied attributes in comparison to a baseline model trained on authentic data. Our results show that the generator generate a similar distribution as the used training data in terms of the different attributes. With regard to bias, it can be seen that the synthetic-based models share a similar bias behavior with the authentic-based models. However, with the uncovered lower intra-identity attribute consistency seems to be beneficial in reducing bias.
Abstract:Liveness Detection (LivDet) is an international competition series open to academia and industry with the objec-tive to assess and report state-of-the-art in Presentation Attack Detection (PAD). LivDet-2023 Noncontact Fingerprint is the first edition of the noncontact fingerprint-based PAD competition for algorithms and systems. The competition serves as an important benchmark in noncontact-based fingerprint PAD, offering (a) independent assessment of the state-of-the-art in noncontact-based fingerprint PAD for algorithms and systems, and (b) common evaluation protocol, which includes finger photos of a variety of Presentation Attack Instruments (PAIs) and live fingers to the biometric research community (c) provides standard algorithm and system evaluation protocols, along with the comparative analysis of state-of-the-art algorithms from academia and industry with both old and new android smartphones. The winning algorithm achieved an APCER of 11.35% averaged overall PAIs and a BPCER of 0.62%. The winning system achieved an APCER of 13.0.4%, averaged over all PAIs tested over all the smartphones, and a BPCER of 1.68% over all smartphones tested. Four-finger systems that make individual finger-based PAD decisions were also tested. The dataset used for competition will be available 1 to all researchers as per data share protocol
Abstract:Face recognition (FR) systems continue to spread in our daily lives with an increasing demand for higher explainability and interpretability of FR systems that are mainly based on deep learning. While bias across demographic groups in FR systems has already been studied, the bias of explainability tools has not yet been investigated. As such tools aim at steering further development and enabling a better understanding of computer vision problems, the possible existence of bias in their outcome can lead to a chain of biased decisions. In this paper, we explore the existence of bias in the outcome of explainability tools by investigating the use case of face presentation attack detection. By utilizing two different explainability tools on models with different levels of bias, we investigate the bias in the outcome of such tools. Our study shows that these tools show clear signs of gender bias in the quality of their explanations.
Abstract:Explainable Face Recognition is gaining growing attention as the use of the technology is gaining ground in security-critical applications. Understanding why two faces images are matched or not matched by a given face recognition system is important to operators, users, anddevelopers to increase trust, accountability, develop better systems, and highlight unfair behavior. In this work, we propose xSSAB, an approach to back-propagate similarity score-based arguments that support or oppose the face matching decision to visualize spatial maps that indicate similar and dissimilar areas as interpreted by the underlying FR model. Furthermore, we present Patch-LFW, a new explainable face verification benchmark that enables along with a novel evaluation protocol, the first quantitative evaluation of the validity of similarity and dissimilarity maps in explainable face recognition approaches. We compare our efficient approach to state-of-the-art approaches demonstrating a superior trade-off between efficiency and performance. The code as well as the proposed Patch-LFW is publicly available at: https://github.com/marcohuber/xSSAB.