Abstract:The recent advancements in Generative Adversarial Networks (GANs) and the emergence of Diffusion models have significantly streamlined the production of highly realistic and widely accessible synthetic content. As a result, there is a pressing need for effective general purpose detection mechanisms to mitigate the potential risks posed by deepfakes. In this paper, we explore the effectiveness of pre-trained vision-language models (VLMs) when paired with recent adaptation methods for universal deepfake detection. Following previous studies in this domain, we employ only a single dataset (ProGAN) in order to adapt CLIP for deepfake detection. However, in contrast to prior research, which rely solely on the visual part of CLIP while ignoring its textual component, our analysis reveals that retaining the text part is crucial. Consequently, the simple and lightweight Prompt Tuning based adaptation strategy that we employ outperforms the previous SOTA approach by 5.01% mAP and 6.61% accuracy while utilizing less than one third of the training data (200k images as compared to 720k). To assess the real-world applicability of our proposed models, we conduct a comprehensive evaluation across various scenarios. This involves rigorous testing on images sourced from 21 distinct datasets, including those generated by GANs-based, Diffusion-based and Commercial tools.
Abstract:This paper investigates the use of computational tools and Open-Source Intelligence (OSINT) techniques for verifying online multimedia content, with a specific focus on real-world cases from the Russia-Ukraine conflict. Over a nine-month period from April to December 2022, we examine verification workflows, tools, and case studies published by \faktiskbar. Our study showcases the effectiveness of diverse resources, including AI tools, geolocation tools, internet archives, and social media monitoring platforms, in enabling journalists and fact-checkers to efficiently process and corroborate evidence, ensuring the dissemination of accurate information. This research underscores the vital role of computational tools and OSINT techniques in promoting evidence-based reporting and combatting misinformation. We also touch on the current limitations of available tools and prospects for future developments in multimedia verification.
Abstract:The growth of misinformation and re-contextualized media in social media and news leads to an increasing need for fact-checking methods. Concurrently, the advancement in generative models makes cheapfakes and deepfakes both easier to make and harder to detect. In this paper, we present a novel approach using generative image models to our advantage for detecting Out-of-Context (OOC) use of images-caption pairs in news. We present two new datasets with a total of $6800$ images generated using two different generative models including (1) DALL-E 2, and (2) Stable-Diffusion. We are confident that the method proposed in this paper can further research on generative models in the field of cheapfake detection, and that the resulting datasets can be used to train and evaluate new models aimed at detecting cheapfakes. We run a preliminary qualitative and quantitative analysis to evaluate the performance of each image generation model for this task, and evaluate a handful of methods for computing image similarity.
Abstract:This paper present a comprehensive comparative analysis of supervised and self-supervised models for deepfake detection. We evaluate eight supervised deep learning architectures and two transformer-based models pre-trained using self-supervised strategies (DINO, CLIP) on four benchmarks (FakeAVCeleb, CelebDF-V2, DFDC, and FaceForensics++). Our analysis includes intra-dataset and inter-dataset evaluations, examining the best performing models, generalisation capabilities, and impact of augmentations. We also investigate the trade-off between model size and performance. Our main goal is to provide insights into the effectiveness of different deep learning architectures (transformers, CNNs), training strategies (supervised, self-supervised), and deepfake detection benchmarks. These insights can help guide the development of more accurate and reliable deepfake detection systems, which are crucial in mitigating the harmful impact of deepfakes on individuals and society.
Abstract:Cheapfake is a recently coined term that encompasses non-AI ("cheap") manipulations of multimedia content. Cheapfakes are known to be more prevalent than deepfakes. Cheapfake media can be created using editing software for image/video manipulations, or even without using any software, by simply altering the context of an image/video by sharing the media alongside misleading claims. This alteration of context is referred to as out-of-context (OOC) misuse of media. OOC media is much harder to detect than fake media, since the images and videos are not tampered. In this challenge, we focus on detecting OOC images, and more specifically the misuse of real photographs with conflicting image captions in news items. The aim of this challenge is to develop and benchmark models that can be used to detect whether given samples (news image and associated captions) are OOC, based on the recently compiled COSMOS dataset.
Abstract:Deepfake media is becoming widespread nowadays because of the easily available tools and mobile apps which can generate realistic looking deepfake videos/images without requiring any technical knowledge. With further advances in this field of technology in the near future, the quantity and quality of deepfake media is also expected to flourish, while making deepfake media a likely new practical tool to spread mis/disinformation. Because of these concerns, the deepfake media detection tools are becoming a necessity. In this study, we propose a novel hybrid transformer network utilizing early feature fusion strategy for deepfake video detection. Our model employs two different CNN networks, i.e., (1) XceptionNet and (2) EfficientNet-B4 as feature extractors. We train both feature extractors along with the transformer in an end-to-end manner on FaceForensics++, DFDC benchmarks. Our model, while having relatively straightforward architecture, achieves comparable results to other more advanced state-of-the-art approaches when evaluated on FaceForensics++ and DFDC benchmarks. Besides this, we also propose novel face cut-out augmentations, as well as random cut-out augmentations. We show that the proposed augmentations improve the detection performance of our model and reduce overfitting. In addition to that, we show that our model is capable of learning from considerably small amount of data.
Abstract:Deepfakes are synthetically generated images, videos or audios, which fraudsters use to manipulate legitimate information. Current deepfake detection systems struggle against unseen data. To address this, we employ three different deep Convolutional Neural Network (CNN) models, (1) VGG16, (2) InceptionV3, and (3) XceptionNet to classify fake and real images extracted from videos. We also constructed a fusion of the deep CNN models to improve the robustness and generalisation capability. The proposed technique outperforms state-of-the-art models with 96.5% accuracy, when tested on publicly available DeepFake Detection Challenge (DFDC) test data, comprising of 400 videos. The fusion model achieves 99% accuracy on lower quality DeepFake-TIMIT dataset videos and 91.88% on higher quality DeepFake-TIMIT videos. In addition to this, we prove that prediction fusion is more robust against adversarial attacks. If one model is compromised by an adversarial attack, the prediction fusion does not let it affect the overall classification.
Abstract:Phishing attacks are the most common type of cyber-attacks used to obtain sensitive information and have been affecting individuals as well as organisations across the globe. Various techniques have been proposed to identify the phishing attacks specifically, deployment of machine intelligence in recent years. However, the deployed algorithms and discriminating factors are very diverse in existing works. In this study, we present a comprehensive analysis of various machine learning algorithms to evaluate their performances over multiple datasets. We further investigate the most significant features within multiple datasets and compare the classification performance with the reduced dimensional datasets. The statistical results indicate that random forest and artificial neural network outperform other classification algorithms, achieving over 97% accuracy using the identified features.
Abstract:Over 26 million cases have been confirmed worldwide (by 20 August 2020) since the Coronavirus disease (COIVD_19) outbreak in December 2019. Research studies have been addressing diverse aspects in relation to COVID_19 including potential symptoms, predictive tools and specifically, correlations with various demographic attributes. However, very limited work is performed towards the modelling of complex associations between the combined demographic attributes and varying nature of the COVID_19 infections across the globe. Investigating the underlying disease associations with the combined demographical characteristics might help in comprehensive analysis this devastating disease as well as contribute to its effective management. In this study, we present an intelligent model to investigate the multi-dimensional associations between the potentially relevant demographic attributes and the COVID_19 severity levels across the globe. We gather multiple demographic attributes and COVID_19 infection data (by 20 August 2020) from various reliable sources, which is then fed-into pattern matching algorithms that include self-organizing maps, class association rules and statistical approaches, to identify the significant associations within the processed dataset. Statistical results and the experts report indicate strong associations between the COVID_19 severity levels and measures of certain demographic attributes such as female smokers, when combined together with other attributes. These results strongly suggest that the mechanism underlying COVID_19 infection severity is associated to distribution of the certain demographic attributes within different regions of the world. The outcomes will aid the understanding of the dynamics of disease spread and its progression that might in turn help the policy makers and the society, in better understanding and management of the disease.