Title: Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization

May 15, 2024

Jenthe Thienpondt, Kris Demuynck

Abstract: Current speaker diarization systems rely on an external voice activity detection (VAD) model that runs before speaker embeddings are extracted from the detected speech segments. In this paper, we establish that the attention system of a speaker embedding extractor acts as a weakly supervised internal VAD model and performs as well as or better than comparable supervised VAD systems. Consequently, speaker diarization can be performed efficiently by extracting the VAD logits and the corresponding speaker embedding simultaneously, removing the need for, and the computational overhead of, an external VAD model. We provide an extensive analysis of the behavior of the frame-level attention system in current speaker verification models and propose a novel speaker diarization pipeline that uses ECAPA2 speaker embeddings for both VAD and embedding extraction. The proposed strategy achieves state-of-the-art performance on the AMI, VoxConverse and DIHARD III diarization benchmarks.
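
The idea of reading VAD scores and a speaker embedding out of the same attention pass can be illustrated with a toy sketch. This is not the authors' ECAPA2 pipeline: it assumes a single scalar attention score per frame, and the function name `attentive_pool_with_vad`, the projection `w`, the bias `b`, and the `threshold` are all hypothetical choices made for illustration only.

```python
import numpy as np

def attentive_pool_with_vad(frames, w, b=0.0, threshold=0.0):
    """Toy attentive pooling that also exposes per-frame attention logits.

    frames: (T, D) array of frame-level features.
    w, b:   hypothetical attention projection and bias (assumed, not from the paper).
    Returns the pooled embedding (D,), the attention logits (T,),
    and a boolean speech mask (T,) obtained by thresholding the logits.
    """
    frames = np.asarray(frames, dtype=float)
    logits = frames @ w + b                      # one attention score per frame
    shifted = logits - logits.max()              # numerically stable softmax
    weights = np.exp(shifted) / np.exp(shifted).sum()
    embedding = weights @ frames                 # attention-weighted mean over time
    speech_mask = logits > threshold             # reuse the same logits as VAD scores
    return embedding, logits, speech_mask

# Four toy frames: the first and last mimic silence, the middle two mimic speech.
frames = np.array([[0.0, 0.0], [2.0, 1.0], [2.0, 3.0], [0.0, 0.0]])
w = np.array([1.0, 0.0])
embedding, logits, speech_mask = attentive_pool_with_vad(frames, w, b=-1.0)
print(speech_mask)
```

In this sketch the frames that receive high attention dominate the pooled embedding and are also the ones flagged as speech, so no separate VAD pass is needed. How the paper derives and calibrates its VAD logits from ECAPA2 is not specified in the abstract.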

* Proceedings of Odyssey 2024: The Speaker and Language Recognition Workshop
