Title: Pushing the limits of raw waveform speaker recognition

Mar 29, 2022

Jee-weon Jung, You Jin Kim, Hee-Soo Heo, Bong-Jin Lee, Youngki Kwon, Joon Son Chung

Abstract: In recent years, speaker recognition systems based on raw waveform inputs have received increasing attention. However, the performance of such systems is typically inferior to that of their state-of-the-art handcrafted feature-based counterparts, which demonstrate equal error rates under 1% on the popular VoxCeleb1 test set. This paper proposes a novel speaker recognition model based on raw waveform inputs. The model incorporates recent advances in machine learning and speaker verification, including the Res2Net backbone module and multi-layer feature aggregation. Our best model achieves an equal error rate of 0.89%, which is competitive with the state-of-the-art models based on handcrafted features, and outperforms the best model based on raw waveform inputs by a large margin. We also explore the application of the proposed model in the context of a self-supervised learning framework. Our self-supervised model outperforms existing single-phase works in this line of research. Finally, we show that self-supervised pre-training is effective for the semi-supervised scenario where we only have a small set of labelled training data, along with a larger set of unlabelled examples.

* submitted to INTERSPEECH 2022 as a conference paper. 5 pages, 2 figures, 5 tables
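
The abstract reports results as equal error rate (EER), the operating point where the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how EER could be approximated from a list of verification trial scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the threshold sweep, and the toy scores are assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate the EER of a set of verification trials.

    scores: similarity scores, where higher means "more likely same speaker".
    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    Hypothetical helper for illustration only.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_target = np.sum(labels == 1)
    n_nontarget = np.sum(labels == 0)
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    best_gap, eer = np.inf, None
    # Sweep every observed score as a candidate decision threshold.
    for threshold in np.unique(scores):
        accept = scores >= threshold
        # False acceptance rate: non-target trials that were accepted.
        far = np.sum(accept & (labels == 0)) / n_nontarget
        # False rejection rate: target trials that were rejected.
        frr = np.sum(~accept & (labels == 1)) / n_target
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy trials (made-up numbers, not results from the paper).
perfect = equal_error_rate([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
overlapping = equal_error_rate([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

With well-separated scores the sketch returns an EER of 0, and overlapping target and non-target scores push it up. Evaluation toolkits usually interpolate along the ROC curve rather than sweeping observed scores, so values from this sketch may differ slightly on real trial lists.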
