Shuowen Hu

2D-3D Attention and Entropy for Pose Robust 2D Facial Recognition

May 14, 2025

J. Brennan Peace, Shuowen Hu, Benjamin S. Riggan

Abstract:Despite recent advances in facial recognition, there remains a fundamental issue concerning degradations in performance due to substantial perspective (pose) differences between enrollment and query (probe) imagery. Therefore, we propose a novel domain adaptive framework to facilitate improved performances across large discrepancies in pose by enabling image-based (2D) representations to infer properties of inherently pose invariant point cloud (3D) representations. Specifically, our proposed framework achieves better pose invariance by using (1) a shared (joint) attention mapping to emphasize common patterns that are most correlated between 2D facial images and 3D facial data and (2) a joint entropy regularizing loss to promote better consistency, enhancing correlations among the intersecting 2D and 3D representations, by leveraging both attention maps. This framework is evaluated on FaceScape and ARL-VTF datasets, where it outperforms competitive methods by achieving profile (90°+) TAR @ 1% FAR improvements of at least 7.1% and 1.57%, respectively.

* To appear at the IEEE International Conference on Automatic Face and Gesture Recognition 2025 (FG2025)
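
The abstract above reports results as TAR @ 1% FAR (true accept rate at a fixed false accept rate). As a rough illustration of how such a figure can be computed from verification scores, here is a minimal sketch. It is not the authors' evaluation code: the function name `tar_at_far`, the strict "score above threshold" acceptance rule, and the toy scores are all assumptions made for the example.

```python
import math


def tar_at_far(genuine_scores, impostor_scores, far=0.01):
    """Return (tar, threshold) for a target false accept rate.

    Assumption: a pair is accepted when its similarity score is
    strictly greater than the threshold.
    """
    impostor_desc = sorted(impostor_scores, reverse=True)
    # Largest number of impostor pairs that may be accepted at this FAR.
    allowed = math.floor(far * len(impostor_desc))
    # Thresholding at the (allowed + 1)-th highest impostor score
    # admits at most `allowed` impostor pairs.
    threshold = impostor_desc[min(allowed, len(impostor_desc) - 1)]
    accepted = sum(1 for s in genuine_scores if s > threshold)
    return accepted / len(genuine_scores), threshold


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    genuine = [0.95, 0.97, 0.50, 0.99]
    impostor = [i / 10 for i in range(10)]
    tar, thr = tar_at_far(genuine, impostor, far=0.1)
    print(f"TAR @ 10% FAR = {tar:.2f} (threshold {thr:.2f})")
```

Picking the threshold from sorted impostor scores, rather than interpolating a quantile, keeps the observed FAR at or below the target even for small impostor sets.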

A Brief Survey on Person Recognition at a Distance

Dec 17, 2022

Chrisopher B. Nalty, Neehar Peri, Joshua Gleason, Carlos D. Castillo, Shuowen Hu, Thirimachos Bourlai, Rama Chellappa

Abstract:Person recognition at a distance entails recognizing the identity of an individual appearing in images or videos collected by long-range imaging systems such as drones or surveillance cameras. Despite recent advances in deep convolutional neural networks (DCNNs), this remains challenging. Images or videos collected by long-range cameras often suffer from atmospheric turbulence, blur, low resolution, unconstrained poses, and poor illumination. In this paper, we provide a brief survey of recent advances in person recognition at a distance. In particular, we review recent work in multi-spectral face verification, person re-identification, and gait-based analysis techniques. Furthermore, we discuss the merits and drawbacks of existing approaches and identify important yet underexplored challenges for deploying remote person recognition systems in the wild.

* This work has been accepted to the IEEE Asilomar Conference on Signals, Systems, and Computers (ACSSC) 2022

Learning Domain and Pose Invariance for Thermal-to-Visible Face Recognition

Nov 17, 2022

Cedric Nimpa Fondje, Shuowen Hu, Benjamin S. Riggan

Abstract:Interest in thermal to visible face recognition has grown significantly over the last decade due to advancements in thermal infrared cameras and analytics beyond the visible spectrum. Despite large discrepancies between thermal and visible spectra, existing approaches bridge domain gaps by either synthesizing visible faces from thermal faces or by learning the cross-spectrum image representations. These approaches typically work well with frontal facial imagery collected at varying ranges and expressions, but exhibit significantly reduced performance when matching thermal faces with varying poses to frontal visible faces. We propose a novel Domain and Pose Invariant Framework (DPIF) that simultaneously learns domain and pose invariant representations. Our proposed framework is composed of modified networks for extracting the most correlated intermediate representations from off-pose thermal and frontal visible face imagery, a sub-network to jointly bridge domain and pose gaps, and a joint-loss function comprised of cross-spectrum and pose-correction losses. We demonstrate efficacy and advantages of the proposed method by evaluating on three thermal-visible datasets: ARL Visible-to-Thermal Face, ARL Multimodal Face, and Tufts Face. Although DPIF focuses on learning to match off-pose thermal to frontal visible faces, we also show that DPIF enhances performance when matching frontal thermal face images to frontal visible face images.

Open-Set Automatic Target Recognition

Nov 10, 2022

Bardia Safaei, Vibashan VS, Celso M. de Melo, Shuowen Hu, Vishal M. Patel

Abstract:Automatic Target Recognition (ATR) is a category of computer vision algorithms which attempts to recognize targets on data obtained from different sensors. ATR algorithms are extensively used in real-world scenarios such as military and surveillance applications. Existing ATR algorithms are developed for traditional closed-set methods where training and testing have the same class distribution. Thus, these algorithms have not been robust to unknown classes not seen during the training phase, limiting their utility in real-world applications. To this end, we propose an Open-set Automatic Target Recognition framework where we enable open-set recognition capability for ATR algorithms. In addition, we introduce a plugin Category-aware Binary Classifier (CBC) module to effectively tackle unknown classes seen during inference. The proposed CBC module can be easily integrated with any existing ATR algorithms and can be trained in an end-to-end manner. Experimental results show that the proposed approach outperforms many open-set methods on the DSIAC and CIFAR-10 datasets. To the best of our knowledge, this is the first work to address the open-set classification problem for ATR algorithms. Source code is available at: https://github.com/bardisafa/Open-set-ATR.

* 5 pages, 3 figures. Submitted to ICASSP 2023
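
To make the open-set idea in the abstract above concrete, the sketch below shows one generic way a closed-set prediction can be rejected as "unknown" using a per-class binary score. It is only an illustration and not the paper's CBC module: the function `open_set_predict`, the `UNKNOWN` label, the 0.5 threshold, and the example scores are hypothetical.

```python
UNKNOWN = -1  # hypothetical label for samples rejected as unknown


def open_set_predict(class_scores, known_scores, threshold=0.5):
    """Pick the top closed-set class, then accept or reject it.

    class_scores: closed-set classifier scores, one per known class.
    known_scores: per-class binary scores in [0, 1]; entry k estimates
        how likely the sample truly belongs to known class k.
    Both inputs and the threshold are assumptions for this sketch.
    """
    best = max(range(len(class_scores)), key=lambda k: class_scores[k])
    # Reject when the binary score for the predicted class is too low.
    if known_scores[best] < threshold:
        return UNKNOWN
    return best


if __name__ == "__main__":
    # Top class 1 is confirmed by its binary score: kept.
    print(open_set_predict([0.1, 0.7, 0.2], [0.9, 0.8, 0.9]))
    # Top class 1 is not confirmed by its binary score: rejected.
    print(open_set_predict([0.1, 0.7, 0.2], [0.9, 0.2, 0.9]))
```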

DefakeHop++: An Enhanced Lightweight Deepfake Detector

Apr 30, 2022

Hong-Shuo Chen, Shuowen Hu, Suya You, C. -C. Jay Kuo

Abstract:On the basis of DefakeHop, an enhanced lightweight Deepfake detector called DefakeHop++ is proposed in this work. The improvements lie in two areas. First, DefakeHop examines three facial regions (i.e., two eyes and mouth) while DefakeHop++ includes eight more landmarks for broader coverage. Second, for discriminant feature selection, DefakeHop uses an unsupervised approach while DefakeHop++ adopts a more effective approach with supervision, called the Discriminant Feature Test (DFT). In DefakeHop++, rich spatial and spectral features are first derived from facial regions and landmarks automatically. Then, DFT is used to select a subset of discriminant features for classifier training. As compared with MobileNet v3 (a lightweight CNN model of 1.5M parameters targeting mobile applications), DefakeHop++ has a model of 238K parameters, which is 16% of MobileNet v3. Furthermore, DefakeHop++ outperforms MobileNet v3 in Deepfake image detection performance in a weakly-supervised setting.
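
The 16% figure quoted above follows from the two parameter counts in the abstract. A quick arithmetic check, assuming "238K" and "1.5M" mean 238,000 and 1,500,000 (the helper name `param_ratio` is hypothetical):

```python
def param_ratio(model_params, reference_params):
    """Return model size as a fraction of the reference model size."""
    return model_params / reference_params


if __name__ == "__main__":
    # Counts taken from the abstract, read as 238,000 and 1,500,000.
    ratio = param_ratio(238_000, 1_500_000)
    print(f"DefakeHop++ / MobileNet v3 = {ratio:.1%}")
```

The ratio comes out slightly under 16%, consistent with the rounded value in the abstract.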

Geo-DefakeHop: High-Performance Geographic Fake Image Detection

Oct 19, 2021

Hong-Shuo Chen, Kaitai Zhang, Shuowen Hu, Suya You, C. -C. Jay Kuo

Abstract:A robust fake satellite image detection method, called Geo-DefakeHop, is proposed in this work. Geo-DefakeHop is developed based on the parallel subspace learning (PSL) methodology. PSL maps the input image space into several feature subspaces using multiple filter banks. By exploring response differences of different channels between real and fake images for a filter bank, Geo-DefakeHop learns the most discriminant channels and uses their soft decision scores as features. Then, Geo-DefakeHop selects a few discriminant features from each filter bank and ensembles them to make a final binary decision. Geo-DefakeHop offers a light-weight high-performance solution to fake satellite image detection. Its model size is analyzed, which ranges from 0.8 to 62K parameters. Furthermore, it is shown by experimental results that it achieves an F1-score higher than 95% under various common image manipulations such as resizing, compression and noise corruption.

Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection using Meta-Learning

Oct 07, 2021

Vibashan VS, Domenick Poster, Suya You, Shuowen Hu, Vishal M. Patel

Abstract:Object detectors trained on large-scale RGB datasets are being extensively employed in real-world applications. However, these RGB-trained models suffer a performance drop under adverse illumination and lighting conditions. Infrared (IR) cameras are robust under such conditions and can be helpful in real-world applications. Though thermal cameras are widely used for military applications and increasingly for commercial applications, there is a lack of robust algorithms for exploiting thermal imagery due to the limited availability of labeled thermal data. In this work, we aim to enhance the object detection performance in the thermal domain by leveraging the labeled visible domain data in an Unsupervised Domain Adaptation (UDA) setting. We propose an algorithm agnostic meta-learning framework to improve existing UDA methods instead of proposing a new UDA strategy. We achieve this by meta-learning the initial condition of the detector, which facilitates the adaptation process with fine updates without overfitting or getting stuck at local optima. However, meta-learning the initial condition for the detection scenario is computationally heavy due to long and intractable computation graphs. Therefore, we propose an online meta-learning paradigm which performs online updates resulting in a short and tractable computation graph. To this end, we demonstrate the superiority of our method over many baselines in the UDA setting, producing a state-of-the-art thermal detector for the KAIST and DSIAC datasets.

* Accepted to WACV 2022

Heterogeneous Face Frontalization via Domain Agnostic Learning

Jul 17, 2021

Xing Di, Shuowen Hu, Vishal M. Patel

Abstract:Recent advances in deep convolutional neural networks (DCNNs) have shown impressive performance improvements on thermal to visible face synthesis and matching problems. However, current DCNN-based synthesis models do not perform well on thermal faces with large pose variations. In order to deal with this problem, heterogeneous face frontalization methods are needed in which a model takes a thermal profile face image and generates a frontal visible face. This is an extremely difficult problem due to the large domain as well as large pose discrepancies between the two modalities. Despite its applications in biometrics and surveillance, this problem is relatively unexplored in the literature. We propose a domain agnostic learning-based generative adversarial network (DAL-GAN) which can synthesize frontal views in the visible domain from thermal faces with pose variations. DAL-GAN consists of a generator with an auxiliary classifier and two discriminators which capture both local and global texture discriminations for better synthesis. A contrastive constraint is enforced in the latent space of the generator with the help of a dual-path training strategy, which improves the feature vector discrimination. Finally, a multi-purpose loss function is utilized to guide the network in synthesizing identity preserving cross-domain frontalization. Extensive experimental results demonstrate that DAL-GAN can generate better quality frontal views compared to the other baseline methods.

* This work has been accepted to the IEEE International Conference on Automatic Face and Gesture Recognition 2021 (FG2021)

Simultaneous Face Hallucination and Translation for Thermal to Visible Face Verification using Axial-GAN

Apr 13, 2021

Rakhil Immidisetti, Shuowen Hu, Vishal M. Patel

Abstract:Existing thermal-to-visible face verification approaches expect the thermal and visible face images to be of similar resolution. This is unlikely in real-world long-range surveillance systems, since humans are distant from the cameras. To address this issue, we introduce the task of thermal-to-visible face verification from low-resolution thermal images. Furthermore, we propose Axial-Generative Adversarial Network (Axial-GAN) to synthesize high-resolution visible images for matching. In the proposed approach we augment the GAN framework with axial-attention layers which leverage the recent advances in transformers for modelling long-range dependencies. We demonstrate the effectiveness of the proposed method by evaluating on two different thermal-visible face datasets. When compared to related state-of-the-art works, our results show significant improvements in both image quality and face verification performance, and the proposed method is also much more efficient.

DefakeHop: A Light-Weight High-Performance Deepfake Detector

Mar 11, 2021

Hong-Shuo Chen, Mozhdeh Rouhsedaghat, Hamza Ghani, Shuowen Hu, Suya You, C. -C. Jay Kuo

Abstract:A light-weight high-performance Deepfake detection method, called DefakeHop, is proposed in this work. State-of-the-art Deepfake detection methods are built upon deep neural networks. DefakeHop extracts features automatically using the successive subspace learning (SSL) principle from various parts of face images. The features are extracted by c/w Saab transform and further processed by our feature distillation module using spatial dimension reduction and soft classification for each channel to get a more concise description of the face. Extensive experiments are conducted to demonstrate the effectiveness of the proposed DefakeHop method. With a small model size of 42,845 parameters, DefakeHop achieves state-of-the-art performance with the area under the ROC curve (AUC) of 100%, 94.95%, and 90.56% on UADFV, Celeb-DF v1 and Celeb-DF v2 datasets, respectively.

* Accepted at ICME 2021
