Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hector Delgado

ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale

Aug 16, 2024

Xin Wang, Hector Delgado, Hemlata Tak, Jee-weon Jung, Hye-jin Shim, Massimiliano Todisco, Ivan Kukanov, Xuechen Liu, Md Sahidullah, Tomi Kinnunen(+3 more)

Figure 1 for ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale

Figure 2 for ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale

Figure 3 for ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale

Figure 4 for ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale

Abstract:ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech spoofing and deepfake attacks, and the design of detection solutions. Compared to previous challenges, the ASVspoof 5 database is built from crowdsourced data collected from a vastly greater number of speakers in diverse acoustic conditions. Attacks, also crowdsourced, are generated and tested using surrogate detection models, while adversarial attacks are incorporated for the first time. New metrics support the evaluation of spoofing-robust automatic speaker verification (SASV) as well as stand-alone detection solutions, i.e., countermeasures without ASV. We describe the two challenge tracks, the new database, the evaluation metrics, baselines, and the evaluation platform, and present a summary of the results. Attacks significantly compromise the baseline systems, while submissions bring substantial improvements.

* 8 pages, ASVspoof 5 Workshop (Interspeech2024 Satellite)

Via

Access Paper or Ask Questions

Benchmarking and challenges in security and privacy for voice biometrics

Sep 01, 2021

Jean-Francois Bonastre, Hector Delgado, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee, Xuechen Liu, Andreas Nautsch, Paul-Gauthier Noe, Jose Patino, Md Sahidullah(+6 more)

Figure 1 for Benchmarking and challenges in security and privacy for voice biometrics

Figure 2 for Benchmarking and challenges in security and privacy for voice biometrics

Figure 3 for Benchmarking and challenges in security and privacy for voice biometrics

Figure 4 for Benchmarking and challenges in security and privacy for voice biometrics

Abstract:For many decades, research in speech technologies has focused upon improving reliability. With this now meeting user expectations for a range of diverse applications, speech technology is today omni-present. As result, a focus on security and privacy has now come to the fore. Here, the research effort is in its relative infancy and progress calls for greater, multidisciplinary collaboration with security, privacy, legal and ethical experts among others. Such collaboration is now underway. To help catalyse the efforts, this paper provides a high-level overview of some related research. It targets the non-speech audience and describes the benchmarking methodology that has spearheaded progress in traditional research and which now drives recent security and privacy initiatives related to voice biometrics. We describe: the ASVspoof challenge relating to the development of spoofing countermeasures; the VoicePrivacy initiative which promotes research in anonymisation for privacy preservation.

* Submitted to the symposium of the ISCA Security & Privacy in Speech Communications (SPSC) special interest group

Via

Access Paper or Ask Questions

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

Apr 16, 2019

Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher(+36 more)

Figure 1 for I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

Figure 2 for I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

Figure 3 for I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

Figure 4 for I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

Abstract:The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of I4U consortium into NIST SRE series of evaluation. The primary objective of the current paper is to summarize the results and lessons learned based on the twelve sub-systems and their fusion submitted to SRE'18. It is also our intention to present a shared view on the advancements, progresses, and major paradigm shifts that we have witnessed as an SRE participant in the past decade from SRE'08 to SRE'18. In this regard, we have seen, among others, a paradigm shift from supervector representation to deep speaker embedding, and a switch of research challenge from channel compensation to domain adaptation.

* 5 pages

Via

Access Paper or Ask Questions

Introduction to Voice Presentation Attack Detection and Recent Advances

Jan 04, 2019

Md Sahidullah, Hector Delgado, Massimiliano Todisco, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Kong-Aik Lee

Figure 1 for Introduction to Voice Presentation Attack Detection and Recent Advances

Figure 2 for Introduction to Voice Presentation Attack Detection and Recent Advances

Figure 3 for Introduction to Voice Presentation Attack Detection and Recent Advances

Figure 4 for Introduction to Voice Presentation Attack Detection and Recent Advances

Abstract:Over the past few years significant progress has been made in the field of presentation attack detection (PAD) for automatic speaker recognition (ASV). This includes the development of new speech corpora, standard evaluation protocols and advancements in front-end feature extraction and back-end classifiers. The use of standard databases and evaluation protocols has enabled for the first time the meaningful benchmarking of different PAD solutions. This chapter summarises the progress, with a focus on studies completed in the last three years. The article presents a summary of findings and lessons learned from two ASVspoof challenges, the first community-led benchmarking efforts. These show that ASV PAD remains an unsolved problem and that further attention is required to develop generalised PAD solutions which have potential to detect diverse and previously unseen spoofing attacks.

* Published in Handbook of Biometric Anti-Spoofing Presentation Attack Detection (Second Edition eBook ISBN 978-3-319-92627-8), 2019
* Published as a book-chapter in Handbook of Biometric Anti-Spoofing Presentation Attack Detection (Second Edition)

Via

Access Paper or Ask Questions

t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification

Apr 25, 2018

Tomi Kinnunen, Kong Aik Lee, Hector Delgado, Nicholas Evans, Massimiliano Todisco, Md Sahidullah, Junichi Yamagishi, Douglas A. Reynolds

Figure 1 for t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification

Figure 2 for t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification

Figure 3 for t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification

Figure 4 for t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification

Abstract:The ASVspoof challenge series was born to spearhead research in anti-spoofing for automatic speaker verification (ASV). The two challenge editions in 2015 and 2017 involved the assessment of spoofing countermeasures (CMs) in isolation from ASV using an equal error rate (EER) metric. While a strategic approach to assessment at the time, it has certain shortcomings. First, the CM EER is not necessarily a reliable predictor of performance when ASV and CMs are combined. Second, the EER operating point is ill-suited to user authentication applications, e.g. telephone banking, characterised by a high target user prior but a low spoofing attack prior. We aim to migrate from CM- to ASV-centric assessment with the aid of a new tandem detection cost function (t-DCF) metric. It extends the conventional DCF used in ASV research to scenarios involving spoofing attacks. The t-DCF metric has 6 parameters: (i) false alarm and miss costs for both systems, and (ii) prior probabilities of target and spoof trials (with an implied third, nontarget prior). The study is intended to serve as a self-contained, tutorial-like presentation. We analyse with the t-DCF a selection of top-performing CM submissions to the 2015 and 2017 editions of ASVspoof, with a focus on the spoofing attack prior. Whereas there is little to choose between countermeasure systems for lower priors, system rankings derived with the EER and t-DCF show differences for higher priors. We observe some ranking changes. Findings support the adoption of the DCF-based metric into the roadmap for future ASVspoof challenges, and possibly for other biometric anti-spoofing evaluations.

* Accepted to Speaker Odyssey 2018

Via

Access Paper or Ask Questions