Abdulazeez AlAli

An RFP dataset for Real, Fake, and Partially fake audio detection

Apr 26, 2024

Abdulazeez AlAli, George Theodorakopoulos

Abstract: Recent advances in deep learning have enabled the creation of natural-sounding synthesised speech. However, attackers have also utilised these technologies to conduct attacks such as phishing. Numerous public datasets have been created to facilitate the development of effective detection models. However, available datasets contain only entirely fake audio; therefore, detection models may miss attacks that replace a short section of the real audio with fake audio. In recognition of this problem, the current paper presents the RFP dataset, which comprises five distinct audio types: partial fake (PF), audio with noise, voice conversion (VC), text-to-speech (TTS), and real. The data are then used to evaluate several detection models, revealing that the available detection models incur a markedly higher equal error rate (EER) when detecting PF audio than when detecting entirely fake audio. The lowest EER recorded was 25.42%. Therefore, we believe that creators of detection models must seriously consider using datasets like RFP that include PF and other types of fake audio.
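For readers unfamiliar with the metric, the following is a minimal sketch of one common way to approximate the equal error rate (EER): sweep a decision threshold and take the point where the false acceptance rate and false rejection rate are closest. The function name, the convention that higher scores mean "more likely real", and the toy scores are all assumptions made for illustration; they are not taken from the paper or its evaluation code.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by a threshold sweep.

    Assumed convention (not from the paper): label 1 = real audio,
    label 0 = fake audio, and a higher score means "more likely real".
    """
    real = [s for s, y in zip(scores, labels) if y == 1]
    fake = [s for s, y in zip(scores, labels) if y == 0]
    if not real or not fake:
        raise ValueError("need at least one real and one fake sample")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    candidates = sorted(set(scores)) + [max(scores) + 1.0]

    best_gap, eer = None, None
    for t in candidates:
        # False acceptance: fake audio scored at or above the threshold.
        far = sum(s >= t for s in fake) / len(fake)
        # False rejection: real audio scored below the threshold.
        frr = sum(s < t for s in real) / len(real)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy, made-up detector scores (illustrative only, not results from the paper).
toy_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(f"EER = {equal_error_rate(toy_scores, toy_labels):.2%}")
```

Under these assumptions, a comparison like the one described in the abstract would amount to calling the function once on real versus PF samples and once on real versus entirely fake samples, then comparing the two values.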
