Title: An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification

Sep 10, 2023

Harunori Kawano, Sota Shimizu

Abstract: Wav2vec2 has succeeded in applying the Transformer architecture and self-supervised learning to speech recognition. Recently, these techniques have come to be used not only for speech recognition but across speech processing as a whole. This paper introduces an effective end-to-end speaker identification model built on a Transformer-based contextual model. We explore the relationship between the hyper-parameters and performance in order to identify the structure of an effective model. Furthermore, we propose a pooling method, Temporal Gate Pooling, with powerful learning ability for speaker identification. We use Conformer as the encoder and BEST-RQ for pre-training, and evaluate on the VoxCeleb1 speaker identification task. The proposed method achieves an accuracy of 87.1% with 28.5M parameters, comparable to the accuracy of wav2vec2 with 317.7M parameters. Code is available at https://github.com/HarunoriKawano/speaker-identification-with-tgp.

* 5 pages, 3 figures
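
To put the parameter counts quoted in the abstract in perspective, the short calculation below works out the size ratio between the two models. It uses only the two figures stated above (28.5M and 317.7M); the variable names are illustrative and do not come from the paper or its repository.

```python
# Parameter counts as quoted in the abstract, in millions.
proposed_params_m = 28.5
wav2vec2_params_m = 317.7

# How many times larger wav2vec2 is than the proposed model.
ratio = wav2vec2_params_m / proposed_params_m

# Share of wav2vec2's parameter count that the proposed model uses.
fraction = proposed_params_m / wav2vec2_params_m

print(f"wav2vec2 has about {ratio:.1f}x as many parameters")
print(f"the proposed model uses about {fraction:.1%} of that count")
```

In other words, the comparable accuracy is reported for a model roughly one eleventh the size of the wav2vec2 baseline.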
