Abstract:Generalizability to unseen forgery types is crucial for face forgery detectors. Recent works have made significant progress in terms of generalization by synthetic forgery data augmentation. In this work, we explore another path for improving the generalization. Our goal is to reduce the features that are easy to learn in the training phase, so as to reduce the risk of overfitting on specific forgery types. Specifically, in our method, a teacher network takes as input the face images and generates an attention map of the deep features by a diverse multihead attention ViT. The attention map is used to guide a student network to focus on the low-attended features by reducing the highly-attended deep features. A deep feature mixup strategy is also proposed to synthesize forgeries in the feature domain. Experiments demonstrate that, without data augmentation, our method is able to achieve promising performances on unseen forgeries and highly compressed data.
Abstract:In this work, we investigate improving the generalizability of GAN-generated image detectors by performing data augmentation in the fingerprint domain. Specifically, we first separate the fingerprints and contents of the GAN-generated images using an autoencoder based GAN fingerprint extractor, followed by random perturbations of the fingerprints. Then the original fingerprints are substituted with the perturbed fingerprints and added to the original contents, to produce images that are visually invariant but with distinct fingerprints. The perturbed images can successfully imitate images generated by different GANs to improve the generalization of the detectors, which is demonstrated by the spectra visualization. To our knowledge, we are the first to conduct data augmentation in the fingerprint domain. Our work explores a novel prospect that is distinct from previous works on spatial and frequency domain augmentation. Extensive cross-GAN experiments demonstrate the effectiveness of our method compared to the state-of-the-art methods in detecting fake images generated by unknown GANs.
Abstract:In this work, we propose a novel method to improve the generalization ability of CNN-based face forgery detectors. Our method considers the feature anomalies of forged faces caused by the prevalent blending operations in face forgery algorithms. Specifically, we propose a weakly supervised Second Order Local Anomaly (SOLA) learning module to mine anomalies in local regions using deep feature maps. SOLA first decomposes the neighborhood of local features by different directions and distances and then calculates the first and second order local anomaly maps which provide more general forgery traces for the classifier. We also propose a Local Enhancement Module (LEM) to improve the discrimination between local features of real and forged regions, so as to ensure accuracy in calculating anomalies. Besides, an improved Adaptive Spatial Rich Model (ASRM) is introduced to help mine subtle noise features via learnable high pass filters. With neither pixel level annotations nor external synthetic data, our method using a simple ResNet18 backbone achieves competitive performances compared with state-of-the-art works when evaluated on unseen forgeries.