Ondrej Glembek

Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch

Mar 19, 2022

Anna Silnova, Themos Stafylakis, Ladislav Mosner, Oldrich Plchot, Johan Rohdin, Pavel Matejka, Lukas Burget, Ondrej Glembek, Niko Brummer

Abstract: In this paper, we analyze the behavior and performance of speaker embeddings and the back-end scoring model under domain and language mismatch. We present our findings regarding ResNet-based speaker embedding architectures and show that reduced temporal stride yields improved performance. We then consider a PLDA back-end and show how a combination of a small speaker subspace, a language-dependent PLDA mixture, and nuisance-attribute projection can have a drastic impact on the performance of the system. In addition, we present an efficient way of scoring and fusing class posterior logit vectors, recently shown to perform well for the speaker verification task. The experiments are performed using the NIST SRE 2021 setup.

* Submitted to Odyssey 2022, under review
