Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:U-vectors: Generating clusterable speaker embedding from unlabeled data

Feb 07, 2021

M. F. Mridha, Abu Quwsar Ohi, M. Ameer Ali, Muhammad Mostafa Monowar, Md. Abdul Hamid

Figure 1 for U-vectors: Generating clusterable speaker embedding from unlabeled data

Figure 2 for U-vectors: Generating clusterable speaker embedding from unlabeled data

Figure 3 for U-vectors: Generating clusterable speaker embedding from unlabeled data

Figure 4 for U-vectors: Generating clusterable speaker embedding from unlabeled data

Share this with someone who'll enjoy it:

Abstract:Speaker recognition deals with recognizing speakers by their speech. Strategies related to speaker recognition may explore speech timbre properties, accent, speech patterns and so on. Supervised speaker recognition has been dramatically investigated. However, through rigorous excavation, we have found that unsupervised speaker recognition systems mostly depend on domain adaptation policy. This paper introduces a speaker recognition strategy dealing with unlabeled data, which generates clusterable embedding vectors from small fixed-size speech frames. The unsupervised training strategy involves an assumption that a small speech segment should include a single speaker. Depending on such a belief, we construct pairwise constraints to train twin deep learning architectures with noise augmentation policies, that generate speaker embeddings. Without relying on domain adaption policy, the process unsupervisely produces clusterable speaker embeddings, and we name it unsupervised vectors (u-vectors). The evaluation is concluded in two popular speaker recognition datasets for English language, TIMIT, and LibriSpeech. Also, we include a Bengali dataset, Bengali ASR, to illustrate the diversity of the domain shifts for speaker recognition systems. Finally, we conclude that the proposed approach achieves remarkable performance using pairwise architectures.

* 16 pages, 6 figures

View paper on

Share this with someone who'll enjoy it:

Title:U-vectors: Generating clusterable speaker embedding from unlabeled data

Paper and Code