Title: Visual and audio scene classification for detecting discrepancies in video: a baseline method and experimental protocol

May 01, 2024

Konstantinos Apostolidis, Jakob Abesser, Luca Cuccovillo, Vasileios Mezaris

Abstract: This paper presents a baseline approach and an experimental protocol for a specific content verification problem: detecting discrepancies between the audio and video modalities in multimedia content. We first design and optimize an audio-visual scene classifier, to compare with existing classification baselines that use both modalities. Then, by applying this classifier separately to the audio and the visual modality, we can detect scene-class inconsistencies between them. To facilitate further research and provide a common evaluation platform, we introduce an experimental protocol and a benchmark dataset simulating such inconsistencies. Our approach achieves state-of-the-art results in scene classification and promising outcomes in audio-visual discrepancy detection, highlighting its potential in content verification applications.
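The discrepancy check described in the abstract can be illustrated with a minimal sketch. It assumes that a scene classifier has already produced one class-probability vector per modality; the function names, the label set, and the probability values below are hypothetical and are not taken from the paper. The sketch flags a discrepancy when the top-scoring scene class differs between the visual and the audio stream, which is only one possible decision rule.

```python
from typing import Sequence, Tuple


def top_class(probs: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label with the highest probability."""
    if len(probs) != len(labels) or not labels:
        raise ValueError("probs and labels must be non-empty and equally long")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]


def detect_discrepancy(
    visual_probs: Sequence[float],
    audio_probs: Sequence[float],
    labels: Sequence[str],
) -> Tuple[str, str, bool]:
    """Compare per-modality scene predictions (hypothetical decision rule).

    Returns the visual label, the audio label, and True when they differ.
    """
    visual_label = top_class(visual_probs, labels)
    audio_label = top_class(audio_probs, labels)
    return visual_label, audio_label, visual_label != audio_label


# Hypothetical scene classes and classifier outputs, for illustration only.
LABELS = ["airport", "park", "metro"]
result = detect_discrepancy([0.1, 0.8, 0.1], [0.7, 0.2, 0.1], LABELS)
print(result)
```

In this example the visual stream is classified as "park" and the audio stream as "airport", so the video would be flagged as inconsistent. A practical system would likely need a softer criterion, for example a confidence threshold or a grouping of acoustically similar classes, but those choices are outside what the abstract specifies.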

* Accepted for publication, 3rd ACM Int. Workshop on Multimedia AI against Disinformation (MAD'24) at ACM ICMR'24, June 10, 2024, Phuket, Thailand. This is the "accepted version".
