Abstract:As deepfake technologies continue to advance, passive detection methods struggle to generalize with various forgery manipulations and datasets. Proactive defense techniques have been actively studied with the primary aim of preventing deepfake operation effectively working. In this paper, we aim to bridge the gap between passive detection and proactive defense, and seek to solve the detection problem utilizing a proactive methodology. Inspired by several watermarking-based forensic methods, we explore a novel detection framework based on the concept of ``hiding a learnable face within a face''. Specifically, relying on a semi-fragile invertible steganography network, a secret template image is embedded into a host image imperceptibly, acting as an indicator monitoring for any malicious image forgery when being restored by the inverse steganography process. Instead of being manually specified, the secret template is optimized during training to resemble a neutral facial appearance, just like a ``big brother'' hidden in the image to be protected. By incorporating a self-blending mechanism and robustness learning strategy with a simulative transmission channel, a robust detector is built to accurately distinguish if the steganographic image is maliciously tampered or benignly processed. Finally, extensive experiments conducted on multiple datasets demonstrate the superiority of the proposed approach over competing passive and proactive detection methods.
Abstract:The enhanced Gaussian noise (EGN) model, which accounts for inter-channel stimulated Raman scattering (ISRS), has been extensively utilized for evaluating nonlinear interference (NLI) within the C+L band. Compared to closed-form expressions and machine learning-based NLI evaluation models, it demonstrates broader applicability and its accuracy is not dependent on the support of large-scale datasets. However, its high computational complexity often results in lengthy computation times. Through analysis, the high-frequency oscillations of the four-wave mixing (FWM) efficiency factor integrand were identified as a primary factor limiting the computational speed of the ISRS EGN model. To address this issue, we propose an accurate approximation method that enables the derivation of a closed-form expression for the FWM efficiency factor without imposing restrictive conditions. Thereby, the scheme proposed in this paper could significantly accelerate the computational speed. Numerical results demonstrate that method in this work could achieve low error levels under high ISRS influence levels, with an MAE of less than 0.001 dB, and no cumulative error over increasing transmission distances, while reducing computation time by over 97%. Furthermore, a parallel computation strategy targeting independent regions within the integration domain is proposed, which could further improve computational efficiency by nearly 11 times.
Abstract:Advanced manipulation techniques have provided criminals with opportunities to make social panic or gain illicit profits through the generation of deceptive media, such as forged face images. In response, various deepfake detection methods have been proposed to assess image authenticity. Sequential deepfake detection, which is an extension of deepfake detection, aims to identify forged facial regions with the correct sequence for recovery. Nonetheless, due to the different combinations of spatial and sequential manipulations, forged face images exhibit substantial discrepancies that severely impact detection performance. Additionally, the recovery of forged images requires knowledge of the manipulation model to implement inverse transformations, which is difficult to ascertain as relevant techniques are often concealed by attackers. To address these issues, we propose Multi-Collaboration and Multi-Supervision Network (MMNet) that handles various spatial scales and sequential permutations in forged face images and achieve recovery without requiring knowledge of the corresponding manipulation method. Furthermore, existing evaluation metrics only consider detection accuracy at a single inferring step, without accounting for the matching degree with ground-truth under continuous multiple steps. To overcome this limitation, we propose a novel evaluation metric called Complete Sequence Matching (CSM), which considers the detection accuracy at multiple inferring steps, reflecting the ability to detect integrally forged sequences. Extensive experiments on several typical datasets demonstrate that MMNet achieves state-of-the-art detection performance and independent recovery performance.