Abstract:The assessment of iris uniqueness plays a crucial role in analyzing the capabilities and limitations of iris recognition systems. Among the various methodologies proposed, Daugman's approach to iris uniqueness stands out as one of the most widely accepted. According to Daugman, uniqueness refers to the iris recognition system's ability to enroll an increasing number of classes while maintaining a near-zero probability of collision between new and enrolled classes. Daugman's approach involves creating distinct IrisCode templates for each iris class within the system and evaluating the sustainable population under a fixed Hamming distance between codewords. In our previous work [23], we utilized Rate-Distortion Theory (as it pertains to the limits of error-correction codes) to establish boundaries for the maximum possible population of iris classes supported by Daugman's IrisCode, given the constraint of a fixed Hamming distance between codewords. Building upon that research, we propose a novel methodology to evaluate the scalability of an iris recognition system, while also measuring iris quality. We achieve this by employing a sphere-packing bound for Gaussian codewords and adopting a approach similar to Daugman's, which utilizes relative entropy as a distance measure between iris classes. To demonstrate the efficacy of our methodology, we illustrate its application on two small datasets of iris images. We determine the sustainable maximum population for each dataset based on the quality of the images. By providing these illustrations, we aim to assist researchers in comprehending the limitations inherent in their recognition systems, depending on the quality of their iris databases.
Abstract:Ground assets deployed in a cluttered environment with randomized obstacles (e.g., a forest) may experience line of sight (LoS) obstruction due to those obstacles. Air assets can be deployed in the vicinity to aid the communication by establishing two-hop paths between the ground assets. Obstacles that are taller than a position-dependent critical height may still obstruct the LoS between a ground asset and an air asset. In this paper, we provide an analytical framework for computing the probability of obtaining a LoS path in a Poisson forest. Given the locations and heights of a ground asset and an air asset, we establish the critical height, which is a function of distance. To account for this dependence on distance, the blocking is modeled as an inhomogenous Poisson point process, and the LoS probability is its void probability. Examples and closed-form expressions are provided for two obstruction height distributions: uniform and truncated Gaussian. The examples are validated through simulation. Additionally, the end-to-end throughput is determined and shown to be a metric that balances communication distance with the impact of LoS blockage. Throughput is used to determine the range at which it is better to relay communications through the air asset, and, when the air asset is deployed, its optimal height.
Abstract:In recent years, periocular recognition has been developed as a valuable biometric identification approach, especially in wild environments (for example, masked faces due to COVID-19 pandemic) where facial recognition may not be applicable. This paper presents a new deep periocular recognition framework called attribute-based deep periocular recognition (ADPR), which predicts soft biometrics and incorporates the prediction into a periocular recognition algorithm to determine identity from periocular images with high accuracy. We propose an end-to-end framework, which uses several shared convolutional neural network (CNN)layers (a common network) whose output feeds two separate dedicated branches (modality dedicated layers); the first branch classifies periocular images while the second branch predicts softn biometrics. Next, the features from these two branches are fused together for a final periocular recognition. The proposed method is different from existing methods as it not only uses a shared CNN feature space to train these two tasks jointly, but it also fuses predicted soft biometric features with the periocular features in the training step to improve the overall periocular recognition performance. Our proposed model is extensively evaluated using four different publicly available datasets. Experimental results indicate that our soft biometric based periocular recognition approach outperforms other state-of-the-art methods for periocular recognition in wild environments.
Abstract:In recent years, with the advent of deep-learning, face recognition has achieved exceptional success. However, many of these deep face recognition models perform much better in handling frontal faces compared to profile faces. The major reason for poor performance in handling of profile faces is that it is inherently difficult to learn pose-invariant deep representations that are useful for profile face recognition. In this paper, we hypothesize that the profile face domain possesses a latent connection with the frontal face domain in a latent feature subspace. We look to exploit this latent connection by projecting the profile faces and frontal faces into a common latent subspace and perform verification or retrieval in the latent domain. We leverage a coupled conditional generative adversarial network (cpGAN) structure to find the hidden relationship between the profile and frontal images in a latent common embedding subspace. Specifically, the cpGAN framework consists of two conditional GAN-based sub-networks, one dedicated to the frontal domain and the other dedicated to the profile domain. Each sub-network tends to find a projection that maximizes the pair-wise correlation between the two feature domains in a common embedding feature subspace. The efficacy of our approach compared with the state-of-the-art is demonstrated using the CFP, CMU Multi-PIE, IJB-A, and IJB-C datasets. Additionally, we have also implemented a coupled convolutional neural network (cpCNN) and an adversarial discriminative domain adaptation network (ADDA) for profile to frontal face recognition. We have evaluated the performance of cpCNN and ADDA and compared it with the proposed cpGAN. Finally, we have also evaluated our cpGAN for reconstruction of frontal faces from input profile faces contained in the VGGFace2 dataset.
Abstract:In recent years, due to the emergence of deep learning, face recognition has achieved exceptional success. However, many of these deep face recognition models perform relatively poorly in handling profile faces compared to frontal faces. The major reason for this poor performance is that it is inherently difficult to learn large pose invariant deep representations that are useful for profile face recognition. In this paper, we hypothesize that the profile face domain possesses a gradual connection with the frontal face domain in the deep feature space. We look to exploit this connection by projecting the profile faces and frontal faces into a common latent space and perform verification or retrieval in the latent domain. We leverage a coupled generative adversarial network (cpGAN) structure to find the hidden relationship between the profile and frontal images in a latent common embedding subspace. Specifically, the cpGAN framework consists of two GAN-based sub-networks, one dedicated to the frontal domain and the other dedicated to the profile domain. Each sub-network tends to find a projection that maximizes the pair-wise correlation between two feature domains in a common embedding feature subspace. The efficacy of our approach compared with the state-of-the-art is demonstrated using the CFP, CMU MultiPIE, IJB-A, and IJB-C datasets.
Abstract:Cross-modal hashing facilitates mapping of heterogeneous multimedia data into a common Hamming space, which can beutilized for fast and flexible retrieval across different modalities. In this paper, we propose a novel cross-modal hashingarchitecture-deep neural decoder cross-modal hashing (DNDCMH), which uses a binary vector specifying the presence of certainfacial attributes as an input query to retrieve relevant face images from a database. The DNDCMH network consists of two separatecomponents: an attribute-based deep cross-modal hashing (ADCMH) module, which uses a margin (m)-based loss function toefficiently learn compact binary codes to preserve similarity between modalities in the Hamming space, and a neural error correctingdecoder (NECD), which is an error correcting decoder implemented with a neural network. The goal of NECD network in DNDCMH isto error correct the hash codes generated by ADCMH to improve the retrieval efficiency. The NECD network is trained such that it hasan error correcting capability greater than or equal to the margin (m) of the margin-based loss function. This results in NECD cancorrect the corrupted hash codes generated by ADCMH up to the Hamming distance of m. We have evaluated and comparedDNDCMH with state-of-the-art cross-modal hashing methods on standard datasets to demonstrate the superiority of our method.
Abstract:In this paper, we present a novel architecture that integrates a deep hashing framework with a neural network decoder (NND) for application to face template protection. It improves upon existing face template protection techniques to provide better matching performance with one-shot and multi-shot enrollment. A key novelty of our proposed architecture is that the framework can also be used with zero-shot enrollment. This implies that our architecture does not need to be re-trained even if a new subject is to be enrolled into the system. The proposed architecture consists of two major components: a deep hashing (DH) component, which is used for robust mapping of face images to their corresponding intermediate binary codes, and a NND component, which corrects errors in the intermediate binary codes that are caused by differences in the enrollment and probe biometrics due to factors such as variation in pose, illumination, and other factors. The final binary code generated by the NND is then cryptographically hashed and stored as a secure face template in the database. The efficacy of our approach with zero-shot, one-shot, and multi-shot enrollments is shown for CMU-PIE, Extended Yale B, WVU multimodal and Multi-PIE face databases. With zero-shot enrollment, the system achieves approximately 85% genuine accept rates (GAR) at 0.01% false accept rate (FAR), and with one-shot and multi-shot enrollments, it achieves approximately 99.95% GAR at 0.01% FAR, while providing a high level of template security.
Abstract:In this paper, we propose a novel multimodal deep hashing neural decoder (MDHND) architecture, which integrates a deep hashing framework with a neural network decoder (NND) to create an effective multibiometric authentication system. The MDHND consists of two separate modules: a multimodal deep hashing (MDH) module, which is used for feature-level fusion and binarization of multiple biometrics, and a neural network decoder (NND) module, which is used to refine the intermediate binary codes generated by the MDH and compensate for the difference between enrollment and probe biometrics (variations in pose, illumination, etc.). Use of NND helps to improve the performance of the overall multimodal authentication system. The MDHND framework is trained in 3 steps using joint optimization of the two modules. In Step 1, the MDH parameters are trained and learned to generate a shared multimodal latent code; in Step 2, the latent codes from Step 1 are passed through a conventional error-correcting code (ECC) decoder to generate the ground truth to train a neural network decoder (NND); in Step 3, the NND decoder is trained using the ground truth from Step 2 and the MDH and NND are jointly optimized. Experimental results on a standard multimodal dataset demonstrate the superiority of our method relative to other current multimodal authentication systems
Abstract:With benefits of fast query speed and low storage cost, hashing-based image retrieval approaches have garnered considerable attention from the research community. In this paper, we propose a novel Error-Corrected Deep Cross Modal Hashing (CMH-ECC) method which uses a bitmap specifying the presence of certain facial attributes as an input query to retrieve relevant face images from the database. In this architecture, we generate compact hash codes using an end-to-end deep learning module, which effectively captures the inherent relationships between the face and attribute modality. We also integrate our deep learning module with forward error correction codes to further reduce the distance between different modalities of the same subject. Specifically, the properties of deep hashing and forward error correction codes are exploited to design a cross modal hashing framework with high retrieval performance. Experimental results using two standard datasets with facial attributes-image modalities indicate that our CMH-ECC face image retrieval model outperforms most of the current attribute-based face image retrieval approaches.
Abstract:Biometric recognition, or simply biometrics, is the use of biological attributes such as face, fingerprints or iris in order to recognize an individual in an automated manner. A key application of biometrics is authentication; i.e., using said biological attributes to provide access by verifying the claimed identity of an individual. This paper presents a framework for Biometrics-as-a-Service (BaaS) that performs biometric matching operations in the cloud, while relying on simple and ubiquitous consumer devices such as smartphones. Further, the framework promotes innovation by providing interfaces for a plurality of software developers to upload their matching algorithms to the cloud. When a biometric authentication request is submitted, the system uses a criteria to automatically select an appropriate matching algorithm. Every time a particular algorithm is selected, the corresponding developer is rendered a micropayment. This creates an innovative and competitive ecosystem that benefits both software developers and the consumers. As a case study, we have implemented the following: (a) an ocular recognition system using a mobile web interface providing user access to a biometric authentication service, and (b) a Linux-based virtual machine environment used by software developers for algorithm development and submission.