Abstract:The impact of voice disorders is becoming more widely acknowledged as a public health issue. Several machine learning-based classifiers with the potential to identify disorders have been used in recent studies to differentiate between normal and pathological voices and sounds. In this paper, we focus on analyzing the vulnerabilities of these systems by exploring the possibility of attacks that can reverse classification and compromise their reliability. Given the critical nature of personal health information, understanding which types of attacks are effective is a necessary first step toward improving the security of such systems. Starting from the original audios, we implement various attack methods, including adversarial, evasion, and pitching techniques, and evaluate how state-of-the-art disorder detection models respond to them. Our findings identify the most effective attack strategies, underscoring the need to address these vulnerabilities in machine-learning systems used in the healthcare domain.
Abstract:3D face reconstruction (3DFR) algorithms are based on specific assumptions tailored to distinct application scenarios. These assumptions limit their use when acquisition conditions, such as the subject's distance from the camera or the camera's characteristics, are different than expected, as typically happens in video surveillance. Additionally, 3DFR algorithms follow various strategies to address the reconstruction of a 3D shape from 2D data, such as statistical model fitting, photometric stereo, or deep learning. In the present study, we explore the application of three 3DFR algorithms representative of the SOTA, employing each one as the template set generator for a face verification system. The scores provided by each system are combined by score-level fusion. We show that the complementarity induced by different 3DFR algorithms improves performance when tests are conducted at never-seen-before distances from the camera and camera characteristics (cross-distance and cross-camera settings), thus encouraging further investigations on multiple 3DFR-based approaches.
Abstract:Large-scale face recognition datasets are collected by crawling the Internet and without individuals' consent, raising legal, ethical, and privacy concerns. With the recent advances in generative models, recently several works proposed generating synthetic face recognition datasets to mitigate concerns in web-crawled face recognition datasets. This paper presents the summary of the Synthetic Data for Face Recognition (SDFR) Competition held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024) and established to investigate the use of synthetic data for training face recognition models. The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones. In the first task, the face recognition backbone was fixed and the dataset size was limited, while the second task provided almost complete freedom on the model backbone, the dataset, and the training pipeline. The submitted models were trained on existing and also new synthetic datasets and used clever methods to improve training with synthetic data. The submissions were evaluated and ranked on a diverse set of seven benchmarking datasets. The paper gives an overview of the submitted face recognition models and reports achieved performance compared to baseline models trained on real and synthetic datasets. Furthermore, the evaluation of submissions is extended to bias assessment across different demography groups. Lastly, an outlook on the current state of the research in training face recognition models using synthetic data is presented, and existing problems as well as potential future directions are also discussed.
Abstract:The International Fingerprint Liveness Detection Competition (LivDet) is a biennial event that invites academic and industry participants to prove their advancements in Fingerprint Presentation Attack Detection (PAD). This edition, LivDet2023, proposed two challenges, Liveness Detection in Action and Fingerprint Representation, to evaluate the efficacy of PAD embedded in verification systems and the effectiveness and compactness of feature sets. A third, hidden challenge is the inclusion of two subsets in the training set whose sensor information is unknown, testing participants ability to generalize their models. Only bona fide fingerprint samples were provided to participants, and the competition reports and assesses the performance of their algorithms suffering from this limitation in data availability.
Abstract:3D face reconstruction algorithms from images and videos are applied to many fields, from plastic surgery to the entertainment sector, thanks to their advantageous features. However, when looking at forensic applications, 3D face reconstruction must observe strict requirements that still make its possible role in bringing evidence to a lawsuit unclear. An extensive investigation of the constraints, potential, and limits of its application in forensics is still missing. Shedding some light on this matter is the goal of the present survey, which starts by clarifying the relation between forensic applications and biometrics, with a focus on face recognition. Therefore, it provides an analysis of the achievements of 3D face reconstruction algorithms from surveillance videos and mugshot images and discusses the current obstacles that separate 3D face reconstruction from an active role in forensic applications. Finally, it examines the underlying data sets, with their advantages and limitations, while proposing alternatives that could substitute or complement them.
Abstract:Fingerprint authentication systems are highly vulnerable to artificial reproductions of fingerprint, called fingerprint presentation attacks. Detecting presentation attacks is not trivial because attackers refine their replication techniques from year to year. The International Fingerprint liveness Detection Competition (LivDet), an open and well-acknowledged meeting point of academies and private companies that deal with the problem of presentation attack detection, has the goal to assess the performance of fingerprint presentation attack detection (FPAD) algorithms by using standard experimental protocols and data sets. Each LivDet edition, held biannually since 2009, is characterized by a different set of challenges against which competitors must be dealt with. The continuous increase of competitors and the noticeable decrease in error rates across competitions demonstrate a growing interest in the topic. This paper reviews the LivDet editions from 2009 to 2021 and points out their evolution over the years.
Abstract:The diffusion of fingerprint verification systems for security applications makes it urgent to investigate the embedding of software-based presentation attack detection algorithms (PAD) into such systems. Companies and institutions need to know whether such integration would make the system more "secure" and whether the technology available is ready, and, if so, at what operational working conditions. Despite significant improvements, especially by adopting deep learning approaches to fingerprint PAD, current research did not state much about their effectiveness when embedded in fingerprint verification systems. We believe that the lack of works is explained by the lack of instruments to investigate the problem, that is, modeling the cause-effect relationships when two non-zero error-free systems work together. Accordingly, this paper explores the fusion of PAD into verification systems by proposing a novel investigation instrument: a performance simulator based on the probabilistic modeling of the relationships among the Receiver Operating Characteristics (ROC) of the two individual systems when PAD and verification stages are implemented sequentially. As a matter of fact, this is the most straightforward, flexible, and widespread approach. We carry out simulations on the PAD algorithms' ROCs submitted to the most recent editions of LivDet (2017-2019), the state-of-the-art NIST Bozorth3, and the top-level Veryfinger 12 matchers. Reported experiments explore significant scenarios to get the conditions under which fingerprint matching with embedded PAD can improve, rather than degrade, the overall personal verification performance.
Abstract:The International Fingerprint Liveness Detection Competition is an international biennial competition open to academia and industry with the aim to assess and report advances in Fingerprint Presentation Attack Detection. The proposed "Liveness Detection in Action" and "Fingerprint representation" challenges were aimed to evaluate the impact of a PAD embedded into a verification system, and the effectiveness and compactness of feature sets for mobile applications. Furthermore, we experimented a new spoof fabrication method that has particularly affected the final results. Twenty-three algorithms were submitted to the competition, the maximum number ever achieved by LivDet.
Abstract:In this study we investigate a textural processing method of electroencephalography (EEG) signal as an indicator to estimate the driver's vigilance in a hypothetical Brain-Computer Interface (BCI) system. The novelty of the solution proposed relies on employing the one-dimensional Local Binary Pattern (1D-LBP) algorithm for feature extraction from pre-processed EEG data. From the resulting feature vector, the classification is done according to three vigilance classes: awake, tired and drowsy. The claim is that the class transitions can be detected by describing the variations of the micro-patterns' occurrences along the EEG signal. The 1D-LBP is able to describe them by detecting mutual variations of the signal temporarily "close" as a short bit-code. Our analysis allows to conclude that the 1D-LBP adoption has led to significant performance improvement. Moreover, capturing the class transitions from the EEG signal is effective, although the overall performance is not yet good enough to develop a BCI for assessing the driver's vigilance in real environments.
Abstract:We present a novel descriptor for crowd behavior analysis and anomaly detection. The goal is to measure by appropriate patterns the speed of formation and disintegration of groups in the crowd. This descriptor is inspired by the concept of one-dimensional local binary patterns: in our case, such patterns depend on the number of group observed in a time window. An appropriate measurement unit, named "trit" (trinary digit), represents three possible dynamic states of groups on a certain frame. Our hypothesis is that abrupt variations of the groups' number may be due to an anomalous event that can be accordingly detected, by translating these variations on temporal trit-based sequence of strings which are significantly different from the one describing the "no-anomaly" one. Due to the peculiarity of the rationale behind this work, relying on the number of groups, three different methods of people group's extraction are compared. Experiments are carried out on the Motion-Emotion benchmark data set. Reported results point out in which cases the trit-based measurement of group dynamics allows us to detect the anomaly. Besides the promising performance of our approach, we show how it is correlated with the anomaly typology and the camera's perspective to the crowd's flow (frontal, lateral).