Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Patrick Flynn

Improved Ear Verification with Vision Transformers and Overlapping Patches

Mar 30, 2025

Deeksha Arun, Kagan Ozturk, Kevin W. Bowyer, Patrick Flynn

Abstract:Ear recognition has emerged as a promising biometric modality due to the relative stability in appearance during adulthood. Although Vision Transformers (ViTs) have been widely used in image recognition tasks, their efficiency in ear recognition has been hampered by a lack of attention to overlapping patches, which is crucial for capturing intricate ear features. In this study, we evaluate ViT-Tiny (ViT-T), ViT-Small (ViT-S), ViT-Base (ViT-B) and ViT-Large (ViT-L) configurations on a diverse set of datasets (OPIB, AWE, WPUT, and EarVN1.0), using an overlapping patch selection strategy. Results demonstrate the critical importance of overlapping patches, yielding superior performance in 44 of 48 experiments in a structured study. Moreover, upon comparing the results of the overlapping patches with the non-overlapping configurations, the increase is significant, reaching up to 10% for the EarVN1.0 dataset. In terms of model performance, the ViT-T model consistently outperformed the ViT-S, ViT-B, and ViT-L models on the AWE, WPUT, and EarVN1.0 datasets. The highest scores were achieved in a configuration with a patch size of 28x28 and a stride of 14 pixels. This patch-stride configuration represents 25% of the normalized image area (112x112 pixels) for the patch size and 12.5% of the row or column size for the stride. This study confirms that transformer architectures with overlapping patch selection can serve as an efficient and high-performing option for ear-based biometric recognition tasks in verification scenarios.

Via

Access Paper or Ask Questions

SiNC+: Adaptive Camera-Based Vitals with Unsupervised Learning of Periodic Signals

Apr 20, 2024

Jeremy Speth, Nathan Vance, Patrick Flynn, Adam Czajka

Figure 1 for SiNC+: Adaptive Camera-Based Vitals with Unsupervised Learning of Periodic Signals

Figure 2 for SiNC+: Adaptive Camera-Based Vitals with Unsupervised Learning of Periodic Signals

Figure 3 for SiNC+: Adaptive Camera-Based Vitals with Unsupervised Learning of Periodic Signals

Figure 4 for SiNC+: Adaptive Camera-Based Vitals with Unsupervised Learning of Periodic Signals

Abstract:Subtle periodic signals, such as blood volume pulse and respiration, can be extracted from RGB video, enabling noncontact health monitoring at low cost. Advancements in remote pulse estimation -- or remote photoplethysmography (rPPG) -- are currently driven by deep learning solutions. However, modern approaches are trained and evaluated on benchmark datasets with ground truth from contact-PPG sensors. We present the first non-contrastive unsupervised learning framework for signal regression to mitigate the need for labelled video data. With minimal assumptions of periodicity and finite bandwidth, our approach discovers the blood volume pulse directly from unlabelled videos. We find that encouraging sparse power spectra within normal physiological bandlimits and variance over batches of power spectra is sufficient for learning visual features of periodic signals. We perform the first experiments utilizing unlabelled video data not specifically created for rPPG to train robust pulse rate estimators. Given the limited inductive biases, we successfully applied the same approach to camera-based respiration by changing the bandlimits of the target signal. This shows that the approach is general enough for unsupervised learning of bandlimited quasi-periodic signals from different domains. Furthermore, we show that the framework is effective for finetuning models on unlabelled video from a single subject, allowing for personalized and adaptive signal regressors.

* Extension of CVPR2023 highlight paper. arXiv admin note: substantial text overlap with arXiv:2303.07944

Via

Access Paper or Ask Questions

Measuring Domain Shifts using Deep Learning Remote Photoplethysmography Model Similarity

Apr 12, 2024

Nathan Vance, Patrick Flynn

Abstract:Domain shift differences between training data for deep learning models and the deployment context can result in severe performance issues for models which fail to generalize. We study the domain shift problem under the context of remote photoplethysmography (rPPG), a technique for video-based heart rate inference. We propose metrics based on model similarity which may be used as a measure of domain shift, and we demonstrate high correlation between these metrics and empirical performance. One of the proposed metrics with viable correlations, DS-diff, does not assume access to the ground truth of the target domain, i.e. it may be applied to in-the-wild data. To that end, we investigate a model selection problem in which ground truth results for the evaluation domain is not known, demonstrating a 13.9% performance improvement over the average case baseline.

Via

Access Paper or Ask Questions

MSPM: A Multi-Site Physiological Monitoring Dataset for Remote Pulse, Respiration, and Blood Pressure Estimation

Feb 03, 2024

Jeremy Speth, Nathan Vance, Benjamin Sporrer, Lu Niu, Patrick Flynn, Adam Czajka

Abstract:Visible-light cameras can capture subtle physiological biomarkers without physical contact with the subject. We present the Multi-Site Physiological Monitoring (MSPM) dataset, which is the first dataset collected to support the study of simultaneous camera-based vital signs estimation from multiple locations on the body. MSPM enables research on remote photoplethysmography (rPPG), respiration rate, and pulse transit time (PTT); it contains ground-truth measurements of pulse oximetry (at multiple body locations) and blood pressure using contacting sensors. We provide thorough experiments demonstrating the suitability of MSPM to support research on rPPG, respiration rate, and PTT. Cross-dataset rPPG experiments reveal that MSPM is a challenging yet high quality dataset, with intra-dataset pulse rate mean absolute error (MAE) below 4 beats per minute (BPM), and cross-dataset pulse rate MAE below 2 BPM in certain cases. Respiration experiments find a MAE of 1.09 breaths per minute by extracting motion features from the chest. PTT experiments find that across the pairs of different body sites, there is high correlation between remote PTT and contact-measured PTT, which facilitates the possibility for future camera-based PTT research.

Via

Access Paper or Ask Questions

Refining Remote Photoplethysmography Architectures using CKA and Empirical Methods

Jan 09, 2024

Nathan Vance, Patrick Flynn

Abstract:Model architecture refinement is a challenging task in deep learning research fields such as remote photoplethysmography (rPPG). One architectural consideration, the depth of the model, can have significant consequences on the resulting performance. In rPPG models that are overprovisioned with more layers than necessary, redundancies exist, the removal of which can result in faster training and reduced computational load at inference time. With too few layers the models may exhibit sub-optimal error rates. We apply Centered Kernel Alignment (CKA) to an array of rPPG architectures of differing depths, demonstrating that shallower models do not learn the same representations as deeper models, and that after a certain depth, redundant layers are added without significantly increased functionality. An empirical study confirms these findings and shows how this method could be used to refine rPPG architectures.

Via

Access Paper or Ask Questions

EyePreserve: Identity-Preserving Iris Synthesis

Dec 30, 2023

Siamul Karim Khan, Patrick Tinsley, Mahsa Mitcheff, Patrick Flynn, Kevin W. Bowyer, Adam Czajka

Figure 1 for EyePreserve: Identity-Preserving Iris Synthesis

Figure 2 for EyePreserve: Identity-Preserving Iris Synthesis

Figure 3 for EyePreserve: Identity-Preserving Iris Synthesis

Figure 4 for EyePreserve: Identity-Preserving Iris Synthesis

Abstract:Synthesis of same-identity biometric iris images, both for existing and non-existing identities while preserving the identity across a wide range of pupil sizes, is complex due to intricate iris muscle constriction mechanism, requiring a precise model of iris non-linear texture deformations to be embedded into the synthesis pipeline. This paper presents the first method of fully data-driven, identity-preserving, pupil size-varying s ynthesis of iris images. This approach is capable of synthesizing images of irises with different pupil sizes representing non-existing identities as well as non-linearly deforming the texture of iris images of existing subjects given the segmentation mask of the target iris image. Iris recognition experiments suggest that the proposed deformation model not only preserves the identity when changing the pupil size but offers better similarity between same-identity iris samples with significant differences in pupil size, compared to state-of-the-art linear and non-linear (bio-mechanical-based) iris deformation models. Two immediate applications of the proposed approach are: (a) synthesis of, or enhancement of the existing biometric datasets for iris recognition, mimicking those acquired with iris sensors, and (b) helping forensic human experts in examining iris image pairs with significant differences in pupil dilation. Source codes and weights of the models are made available with the paper.

Via

Access Paper or Ask Questions

Iris Liveness Detection Competition (LivDet-Iris) -- The 2023 Edition

Oct 06, 2023

Patrick Tinsley, Sandip Purnapatra, Mahsa Mitcheff, Aidan Boyd, Colton Crum, Kevin Bowyer, Patrick Flynn, Stephanie Schuckers, Adam Czajka, Meiling Fang(+11 more)

Figure 1 for Iris Liveness Detection Competition (LivDet-Iris) -- The 2023 Edition

Figure 2 for Iris Liveness Detection Competition (LivDet-Iris) -- The 2023 Edition

Figure 3 for Iris Liveness Detection Competition (LivDet-Iris) -- The 2023 Edition

Figure 4 for Iris Liveness Detection Competition (LivDet-Iris) -- The 2023 Edition

Abstract:This paper describes the results of the 2023 edition of the ''LivDet'' series of iris presentation attack detection (PAD) competitions. New elements in this fifth competition include (1) GAN-generated iris images as a category of presentation attack instruments (PAI), and (2) an evaluation of human accuracy at detecting PAI as a reference benchmark. Clarkson University and the University of Notre Dame contributed image datasets for the competition, composed of samples representing seven different PAI categories, as well as baseline PAD algorithms. Fraunhofer IGD, Beijing University of Civil Engineering and Architecture, and Hochschule Darmstadt contributed results for a total of eight PAD algorithms to the competition. Accuracy results are analyzed by different PAI types, and compared to human accuracy. Overall, the Fraunhofer IGD algorithm, using an attention-based pixel-wise binary supervision network, showed the best-weighted accuracy results (average classification error rate of 37.31%), while the Beijing University of Civil Engineering and Architecture's algorithm won when equal weights for each PAI were given (average classification rate of 22.15%). These results suggest that iris PAD is still a challenging problem.

* 8 pages, IJCB 2023

Via

Access Paper or Ask Questions

Promoting Generalization in Cross-Dataset Remote Photoplethysmography

May 24, 2023

Nathan Vance, Jeremy Speth, Benjamin Sporrer, Patrick Flynn

Figure 1 for Promoting Generalization in Cross-Dataset Remote Photoplethysmography

Figure 2 for Promoting Generalization in Cross-Dataset Remote Photoplethysmography

Figure 3 for Promoting Generalization in Cross-Dataset Remote Photoplethysmography

Figure 4 for Promoting Generalization in Cross-Dataset Remote Photoplethysmography

Abstract:Remote Photoplethysmography (rPPG), or the remote monitoring of a subject's heart rate using a camera, has seen a shift from handcrafted techniques to deep learning models. While current solutions offer substantial performance gains, we show that these models tend to learn a bias to pulse wave features inherent to the training dataset. We develop augmentations to mitigate this learned bias by expanding both the range and variability of heart rates that the model sees while training, resulting in improved model convergence when training and cross-dataset generalization at test time. Through a 3-way cross dataset analysis we demonstrate a reduction in mean absolute error from over 13 beats per minute to below 3 beats per minute. We compare our method with other recent rPPG systems, finding similar performance under a variety of evaluation parameters.

* 8 pages, accepted for publication at CVPM 2023

Via

Access Paper or Ask Questions

Full-Body Cardiovascular Sensing with Remote Photoplethysmography

Mar 16, 2023

Lu Niu, Jeremy Speth, Nathan Vance, Ben Sporrer, Adam Czajka, Patrick Flynn

Figure 1 for Full-Body Cardiovascular Sensing with Remote Photoplethysmography

Figure 2 for Full-Body Cardiovascular Sensing with Remote Photoplethysmography

Figure 3 for Full-Body Cardiovascular Sensing with Remote Photoplethysmography

Figure 4 for Full-Body Cardiovascular Sensing with Remote Photoplethysmography

Abstract:Remote photoplethysmography (rPPG) allows for noncontact monitoring of blood volume changes from a camera by detecting minor fluctuations in reflected light. Prior applications of rPPG focused on face videos. In this paper we explored the feasibility of rPPG from non-face body regions such as the arms, legs, and hands. We collected a new dataset titled Multi-Site Physiological Monitoring (MSPM), which will be released with this paper. The dataset consists of 90 frames per second video of exposed arms, legs, and face, along with 10 synchronized PPG recordings. We performed baseline heart rate estimation experiments from non-face regions with several state-of-the-art rPPG approaches, including chrominance-based (CHROM), plane-orthogonal-to-skin (POS) and RemotePulseNet (RPNet). To our knowledge, this is the first evaluation of the fidelity of rPPG signals simultaneously obtained from multiple regions of a human body. Our experiments showed that skin pixels from arms, legs, and hands are all potential sources of the blood volume pulse. The best-performing approach, POS, achieved a mean absolute error peaking at 7.11 beats per minute from non-facial body parts compared to 1.38 beats per minute from the face. Additionally, we performed experiments on pulse transit time (PTT) from both the contact PPG and rPPG signals. We found that remote PTT is possible with moderately high frame rate video when distal locations on the body are visible. These findings and the supporting dataset should facilitate new research on non-face rPPG and monitoring blood flow dynamics over the whole body with a camera.

Via

Access Paper or Ask Questions

Non-Contrastive Unsupervised Learning of Physiological Signals from Video

Mar 14, 2023

Jeremy Speth, Nathan Vance, Patrick Flynn, Adam Czajka

Abstract:Subtle periodic signals such as blood volume pulse and respiration can be extracted from RGB video, enabling remote health monitoring at low cost. Advancements in remote pulse estimation -- or remote photoplethysmography (rPPG) -- are currently driven by deep learning solutions. However, modern approaches are trained and evaluated on benchmark datasets with associated ground truth from contact-PPG sensors. We present the first non-contrastive unsupervised learning framework for signal regression to break free from the constraints of labelled video data. With minimal assumptions of periodicity and finite bandwidth, our approach is capable of discovering the blood volume pulse directly from unlabelled videos. We find that encouraging sparse power spectra within normal physiological bandlimits and variance over batches of power spectra is sufficient for learning visual features of periodic signals. We perform the first experiments utilizing unlabelled video data not specifically created for rPPG to train robust pulse rate estimators. Given the limited inductive biases and impressive empirical results, the approach is theoretically capable of discovering other periodic signals from video, enabling multiple physiological measurements without the need for ground truth signals. Codes to fully reproduce the experiments are made available along with the paper.

* Accepted to CVPR 2023

Via

Access Paper or Ask Questions