Abstract: This comprehensive survey serves as a resource for researchers entering the field of fake news detection. It highlights the pivotal role of dataset quality and diversity in the effectiveness and robustness of detection models. The survey outlines the key features of existing datasets, the labeling systems they employ, and prevalent biases that can impact model performance. It also addresses critical ethical issues and best practices, offering a thorough overview of the current state of available datasets. Our contribution is further enriched by the provision of a GitHub repository that consolidates publicly accessible datasets into a single, user-friendly portal. This repository is designed to facilitate and stimulate further research and development aimed at combating the pervasive issue of fake news.
Abstract: This study addresses the critical challenge of detecting DeepFake tweets by leveraging advanced natural language processing (NLP) techniques to distinguish between genuine and AI-generated texts. Given the increasing prevalence of misinformation, our research uses the TweepFake dataset to train and evaluate various machine learning models. The objective is to identify effective strategies for recognizing DeepFake content and thereby enhance the integrity of digital communications. By developing reliable methods for detecting AI-generated misinformation, this work contributes to a more trustworthy online information environment.
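For illustration only, the sketch below shows what a minimal baseline for this kind of human-vs-machine tweet classification could look like. It assumes TweepFake-style (text, label) pairs and uses a TF-IDF plus logistic regression pipeline; the toy tweets, the column layout, and the choice of model are assumptions for demonstration, not the models or preprocessing evaluated in the study.

```python
# Minimal sketch of a baseline DeepFake-tweet classifier.
# Assumptions: tweets are available as (text, label) pairs with labels
# "human" or "bot"; the toy data and TF-IDF + logistic regression
# pipeline are illustrative, not the study's actual method.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.metrics import classification_report

# Toy stand-in for TweepFake-style data: tweet text paired with its source.
texts = [
    "Just finished a great run in the park, feeling energized!",
    "Can't believe the game last night, what a comeback!",
    "As an AI, I generate plausible but generic tweet content.",
    "Breaking update: synthetic text continues to spread online.",
]
labels = ["human", "human", "bot", "bot"]

# Hold out half of the toy data for evaluation, keeping class balance.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, random_state=42, stratify=labels
)

# Character n-grams are a common choice for machine-generated-text detection,
# since they capture stylistic regularities of generators.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

In practice, such a sparse-feature baseline would be compared against the NLP models evaluated in the study on the full TweepFake dataset rather than on toy examples.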