Panagiotis C. Petrantonakis

Composite Data Augmentations for Synthetic Image Detection Against Real-World Perturbations

Jun 13, 2025

Efthymia Amarantidou, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:The advent of accessible Generative AI tools enables anyone to create and spread synthetic images on social media, often with the intention to mislead, thus posing a significant threat to online information integrity. Most existing Synthetic Image Detection (SID) solutions struggle on generated images sourced from the Internet, as these are often altered by compression and other operations. To address this, our research enhances SID by exploring data augmentation combinations, leveraging a genetic algorithm for optimal augmentation selection, and introducing a dual-criteria optimization approach. These methods significantly improve model performance under real-world perturbations. Our findings provide valuable insights for developing detection models capable of identifying synthetic images across varying qualities and transformations, with the best-performing model achieving a mean average precision increase of +22.53% compared to models without augmentations. The implementation is available at github.com/efthimia145/sid-composite-data-augmentation.

* EUSIPCO 2025 (33rd European Signal Processing Conference)
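
The abstract above mentions a genetic algorithm for selecting augmentation combinations. The following is a minimal sketch of that general idea, not the authors' implementation: the augmentation names, the `toy_fitness` function and its weights, and all hyperparameters are hypothetical placeholders. In a real setting the fitness would presumably be a validation score of a detector trained with the selected augmentations.

```python
import random

# Hypothetical augmentation pool (the paper's actual pool is not listed here).
AUGMENTATIONS = ["jpeg_compression", "gaussian_blur", "resize",
                 "gaussian_noise", "color_jitter"]


def toy_fitness(mask):
    """Placeholder fitness with made-up weights.

    A real fitness would train and validate a detector using the
    augmentations whose bit is set in `mask`.
    """
    weights = [0.30, 0.20, 0.25, 0.05, -0.10]
    score = sum(w for w, bit in zip(weights, mask) if bit)
    return score - 0.02 * sum(mask)  # small cost per enabled augmentation


def evolve(fitness, n_genes, pop_size=12, generations=20,
           mutation_rate=0.1, seed=0):
    """Evolve a binary mask (1 = augmentation enabled) with a basic GA."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = [ranked[0][:], ranked[1][:]]  # elitism: keep the two best
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals.
            parent_a = max(rng.sample(population, 3), key=fitness)
            parent_b = max(rng.sample(population, 3), key=fitness)
            # Single-point crossover.
            cut = rng.randint(1, n_genes - 1)
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


best = evolve(toy_fitness, len(AUGMENTATIONS))
selected = [name for name, bit in zip(AUGMENTATIONS, best) if bit]
print(selected, round(toy_fitness(best), 3))
```

Encoding each candidate as a binary mask keeps crossover and mutation trivial; the dual-criteria optimization the abstract refers to is not modeled in this sketch.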

Latent Multimodal Reconstruction for Misinformation Detection

Apr 08, 2025

Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:Multimodal misinformation, such as miscaptioned images, where captions misrepresent an image's origin, context, or meaning, poses a growing challenge in the digital age. To support fact-checkers, researchers have been focusing on creating datasets and developing methods for multimodal misinformation detection (MMD). Due to the scarcity of large-scale annotated MMD datasets, recent studies leverage synthetic training data via out-of-context image-caption pairs or named entity manipulations, altering names, dates, and locations. However, these approaches often produce simplistic misinformation that fails to reflect real-world complexity, limiting the robustness of detection models trained on them. Meanwhile, despite recent advancements, Large Vision-Language Models (LVLMs) remain underutilized for generating diverse, realistic synthetic training data for MMD. To address this gap, we introduce "MisCaption This!", a training dataset comprising LVLM-generated miscaptioned images. Additionally, we introduce "Latent Multimodal Reconstruction" (LAMAR), a network trained to reconstruct the embeddings of truthful captions, providing a strong auxiliary signal to the detection process. To optimize LAMAR, we explore different training strategies (end-to-end training and large-scale pre-training) and integration approaches (direct, mask, gate, and attention). Extensive experiments show that models trained on "MisCaption This!" generalize better on real-world misinformation, while LAMAR sets a new state-of-the-art on both the NewsCLIPpings and VERITE benchmarks, highlighting the potential of LVLM-generated data and reconstruction-based approaches for advancing MMD. We release our code at: https://github.com/stevejpapad/miscaptioned-image-reconstruction

SpikeSift: A Computationally Efficient and Drift-Resilient Spike Sorting Algorithm

Apr 02, 2025

Vasileios Georgiadis, Panagiotis C. Petrantonakis

Abstract:Spike sorting is a fundamental step in analyzing extracellular recordings, enabling the isolation of individual neuronal activity, yet it remains a challenging problem due to overlapping signals and recording instabilities, including electrode drift. While numerous algorithms have been developed to address these challenges, many struggle to balance accuracy and computational efficiency, limiting their applicability to large-scale datasets. In response, we introduce SpikeSift, a novel spike sorting algorithm designed to mitigate drift by partitioning recordings into short, relatively stationary segments, with spikes subsequently sorted within each. To preserve neuronal identity across segment boundaries, a computationally efficient alignment process merges clusters without relying on continuous trajectory estimation. In contrast to conventional methods that separate spike detection from clustering, SpikeSift integrates these processes within an iterative detect-and-subtract framework, enhancing clustering accuracy while maintaining computational efficiency. Evaluations on intracellularly validated datasets and biophysically realistic MEArec simulations confirm that SpikeSift maintains high sorting accuracy even in the presence of electrode drift, providing a scalable and computationally efficient solution for large-scale extracellular recordings.

* 22 pages, 6 figures, 4 tables

A Large-scale AI-generated Image Inpainting Benchmark

Feb 10, 2025

Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:Recent advances in generative models enable highly realistic image manipulations, creating an urgent need for robust forgery detection methods. Current datasets for training and evaluating these methods are limited in scale and diversity. To address this, we propose a methodology for creating high-quality inpainting datasets and apply it to create DiQuID, comprising over 95,000 inpainted images generated from 78,000 original images sourced from MS-COCO, RAISE, and OpenImages. Our methodology consists of three components: (1) Semantically Aligned Object Replacement (SAOR) that identifies suitable objects through instance segmentation and generates contextually appropriate prompts, (2) Multiple Model Image Inpainting (MMII) that employs various state-of-the-art inpainting pipelines primarily based on diffusion models to create diverse manipulations, and (3) Uncertainty-Guided Deceptiveness Assessment (UGDA) that evaluates image realism through comparative analysis with originals. The resulting dataset surpasses existing ones in diversity, aesthetic quality, and technical quality. We provide comprehensive benchmarking results using state-of-the-art forgery detection methods, demonstrating the dataset's effectiveness in evaluating and improving detection algorithms. Through a human study with 42 participants on 1,000 images, we show that while humans struggle with images classified as deceiving by our methodology, models trained on our dataset maintain high performance on these challenging cases. Code and dataset are available at https://github.com/mever-team/DiQuID.

Similarity over Factuality: Are we making progress on multimodal out-of-context misinformation detection?

Jul 18, 2024

Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:Out-of-context (OOC) misinformation poses a significant challenge in multimodal fact-checking, where images are paired with texts that misrepresent their original context to support false narratives. Recent research in evidence-based OOC detection has seen a trend towards increasingly complex architectures, incorporating Transformers, foundation models, and large language models. In this study, we introduce a simple yet robust baseline, which assesses MUltimodal SimilaritiEs (MUSE), specifically the similarity between image-text pairs and external image and text evidence. Our results demonstrate that MUSE, when used with conventional classifiers like Decision Tree, Random Forest, and Multilayer Perceptron, can compete with and even surpass the state-of-the-art on the NewsCLIPpings and VERITE datasets. Furthermore, integrating MUSE in our proposed "Attentive Intermediate Transformer Representations" (AITR) significantly improved performance, by 3.3% and 7.5% on NewsCLIPpings and VERITE, respectively. Nevertheless, the success of MUSE, relying on surface-level patterns and shortcuts, without examining factuality and logical inconsistencies, raises critical questions about how we define the task, construct datasets, collect external evidence and overall, how we assess progress in the field. We release our code at: https://github.com/stevejpapad/outcontext-misinfo-progress
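
As a rough illustration of the similarity-based features described in the abstract above, the sketch below computes cosine similarities between an image-text pair and external image and text evidence. It is an assumption-laden example, not the released MUSE code: the function names, the choice of five features, the max-over-evidence aggregation, and the toy 2-dimensional embeddings are all hypothetical. In practice the embeddings would come from a pretrained vision-language encoder, and the resulting feature vector would be passed to a conventional classifier.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_features(image, text, evidence_images, evidence_texts):
    """Build a small feature vector of similarities (hypothetical feature set).

    For each evidence pool, the highest similarity to the query is kept;
    an empty pool contributes 0.0.
    """
    def best(query, pool):
        return max((cosine(query, item) for item in pool), default=0.0)

    return [
        cosine(image, text),          # image vs. its caption
        best(image, evidence_images), # image vs. image evidence
        best(text, evidence_texts),   # caption vs. text evidence
        best(image, evidence_texts),  # image vs. text evidence
        best(text, evidence_images),  # caption vs. image evidence
    ]


# Toy embeddings purely for illustration.
features = similarity_features(
    image=[1.0, 0.0],
    text=[0.6, 0.8],
    evidence_images=[[0.8, 0.6]],
    evidence_texts=[[0.0, 1.0]],
)
print(features)
```

Because the features are plain similarity scores, a classifier trained on them can only pick up surface-level agreement between the pair and the evidence, which is the concern the abstract raises.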

Credible, Unreliable or Leaked?: Evidence Verification for Enhanced Automated Fact-checking

Apr 29, 2024

Zacharias Chrysidis, Stefanos-Iordanis Papadopoulos, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:Automated fact-checking (AFC) is garnering increasing attention from researchers aiming to help fact-checkers combat the growing spread of misinformation online. While many existing AFC methods incorporate external information from the Web to help examine the veracity of claims, they often overlook the importance of verifying the source and quality of collected "evidence". One overlooked challenge involves the reliance on "leaked evidence", information gathered directly from fact-checking websites and used to train AFC systems, resulting in an unrealistic setting for early misinformation detection. Similarly, the inclusion of information from unreliable sources can undermine the effectiveness of AFC systems. To address these challenges, we present a comprehensive approach to evidence verification and filtering. We create the "CREDible, Unreliable or LEaked" (CREDULE) dataset, which consists of 91,632 articles classified as Credible, Unreliable and Fact checked (Leaked). Additionally, we introduce the EVidence VERification Network (EVVER-Net), trained on CREDULE to detect leaked and unreliable evidence in both short and long texts. EVVER-Net can be used to filter evidence collected from the Web, thus enhancing the robustness of end-to-end AFC systems. We experiment with various language models and show that EVVER-Net can demonstrate impressive performance of up to 91.5% and 94.4% accuracy, while leveraging domain credibility scores along with short or long texts, respectively. Finally, we assess the evidence provided by widely-used fact-checking datasets including LIAR-PLUS, MOCHEG, FACTIFY, NewsCLIPpings+ and VERITE, some of which exhibit concerning rates of leaked and unreliable evidence.

RED-DOT: Multimodal Fact-checking via Relevant Evidence Detection

Nov 16, 2023

Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:Online misinformation is often multimodal in nature, i.e., it is caused by misleading associations between texts and accompanying images. To support the fact-checking process, researchers have recently been developing automatic multimodal methods that gather and analyze external information, evidence, related to the image-text pairs under examination. However, prior works assumed all collected evidence to be relevant. In this study, we introduce a "Relevant Evidence Detection" (RED) module to discern whether each piece of evidence is relevant, to support or refute the claim. Specifically, we develop the "Relevant Evidence Detection Directed Transformer" (RED-DOT) and explore multiple architectural variants (e.g., single or dual-stage) and mechanisms (e.g., "guided attention"). Extensive ablation and comparative experiments demonstrate that RED-DOT achieves significant improvements over the state-of-the-art on the VERITE benchmark by up to 28.5%. Furthermore, our evidence re-ranking and element-wise modality fusion led to RED-DOT achieving competitive and even improved performance on NewsCLIPpings+, without requiring numerous pieces of evidence or multiple backbone encoders. Finally, our qualitative analysis demonstrates that the proposed "guided attention" module has the potential to enhance the architecture's interpretability. We release our code at: https://github.com/stevejpapad/relevant-evidence-detection

Figments and Misalignments: A Framework for Fine-grained Crossmodal Misinformation Detection

Apr 27, 2023

Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:Multimedia content has become ubiquitous on social media platforms, leading to the rise of multimodal misinformation and the urgent need for effective strategies to detect and prevent its spread. This study focuses on CrossModal Misinformation (CMM) where image-caption pairs work together to spread falsehoods. We contrast CMM with Asymmetric Multimodal Misinformation (AMM), where one dominant modality propagates falsehoods while other modalities have little or no influence. We show that AMM adds noise to the training and evaluation process while exacerbating the unimodal bias, where text-only or image-only detectors can seemingly outperform their multimodal counterparts on an inherently multimodal task. To address this issue, we collect and curate FIGMENTS, a robust evaluation benchmark for CMM, which consists of real-world cases of misinformation, excludes AMM and utilizes modality balancing to successfully alleviate unimodal bias. FIGMENTS also provides a first step towards fine-grained CMM detection by including three classes: truthful, out-of-context, and miscaptioned image-caption pairs. Furthermore, we introduce a method for generating realistic synthetic training data that maintains crossmodal relations between legitimate images and false human-written captions that we term Crossmodal HArd Synthetic MisAlignment (CHASMA). We conduct an extensive comparative study using a Transformer-based architecture. Our results show that incorporating CHASMA in conjunction with other generated datasets consistently improved the overall performance on FIGMENTS in both binary (+6.26%) and multiclass settings (+15.8%). We release our code at: https://github.com/stevejpapad/figments-and-misalignments

Synthetic Misinformers: Generating and Combating Multimodal Misinformation

Mar 02, 2023

Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Panagiotis C. Petrantonakis

Abstract:With the expansion of social media and the increasing dissemination of multimedia content, the spread of misinformation has become a major concern. This necessitates effective strategies for multimodal misinformation detection (MMD) that detect whether the combination of an image and its accompanying text could mislead or misinform. Due to the data-intensive nature of deep neural networks and the labor-intensive process of manual annotation, researchers have been exploring various methods for automatically generating synthetic multimodal misinformation - which we refer to as Synthetic Misinformers - in order to train MMD models. However, limited evaluation on real-world misinformation and a lack of comparisons with other Synthetic Misinformers make it difficult to assess progress in the field. To address this, we perform a comparative study on existing and new Synthetic Misinformers that involves (1) out-of-context (OOC) image-caption pairs, (2) cross-modal named entity inconsistency (NEI) as well as (3) hybrid approaches, and we evaluate them against real-world misinformation using the COSMOS benchmark. The comparative study showed that our proposed CLIP-based Named Entity Swapping can lead to MMD models that surpass other OOC and NEI Misinformers in terms of multimodal accuracy and that hybrid approaches can lead to even higher detection accuracy. Nevertheless, after alleviating information leakage from the COSMOS evaluation protocol, low Sensitivity scores indicate that the task is significantly more challenging than previous studies suggested. Finally, our findings showed that NEI-based Synthetic Misinformers tend to suffer from a unimodal bias, where text-only MMDs can outperform multimodal ones.

Removing Noise from Extracellular Neural Recordings Using Fully Convolutional Denoising Autoencoders

Sep 18, 2021

Christodoulos Kechris, Alexandros Delitzas, Vasileios Matsoukas, Panagiotis C. Petrantonakis

Abstract:Extracellular recordings are severely contaminated by a considerable number of noise sources, rendering the denoising process an extremely challenging task that should be tackled for efficient spike sorting. To this end, we propose an end-to-end deep learning approach to the problem, utilizing a Fully Convolutional Denoising Autoencoder, which learns to produce a clean neuronal activity signal from a noisy multichannel input. The experimental results on simulated data show that our proposed method can significantly improve the quality of noise-corrupted neural signals, outperforming widely-used wavelet denoising techniques.

* Accepted version to be published in the 43rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2021)
