Title: Improving End-to-End Neural Diarization Using Conversational Summary Representations

Jun 24, 2023

Samuel J. Broughton, Lahiru Samarakoon

Abstract: Speaker diarization is the task of partitioning an audio recording by speaker identity. End-to-end neural diarization with encoder-decoder based attractor calculation (EEND-EDA) aims to solve this problem by directly outputting diarization results for a flexible number of speakers. Currently, the EDA module responsible for generating speaker-wise attractors is conditioned on zero vectors, which provide no relevant information to the network. In this work, we extend EEND-EDA by replacing the zero vectors input to the decoder with learned conversational summary representations. The updated EDA module sequentially generates speaker-wise attractors based on utterance-level information. We propose three methods to initialize the summary vector and investigate the effect of varying input recording lengths. On a range of publicly available test sets, our model achieves an absolute DER improvement of 1.90% over the baseline.

* 5 pages, 1 figure, INTERSPEECH 2023
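
The change described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the LSTM weights are random and untrained, the number of speakers is fixed rather than inferred, and the summary vector is taken to be the mean of the frame embeddings, which is only a stand-in assumption for the learned summary representations the paper proposes. All names (`LSTMCell`, `eda_attractors`, `decoder_input`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell with random (untrained) weights, for illustration only."""
    def __init__(self, dim, rng):
        # One weight matrix for the four gates, acting on [input; hidden].
        self.W = rng.standard_normal((4 * dim, 2 * dim)) * 0.1
        self.b = np.zeros(4 * dim)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

def eda_attractors(embeddings, num_speakers, encoder, decoder, decoder_input):
    """Encode the frame embeddings, then decode one attractor per speaker.

    decoder_input is fed to the decoder at every step: a zero vector in
    the baseline, a conversational summary vector in the extension.
    """
    dim = embeddings.shape[1]
    h, c = np.zeros(dim), np.zeros(dim)
    for frame in embeddings:              # encoder pass over all frames
        h, c = encoder.step(frame, h, c)
    attractors = []
    for _ in range(num_speakers):         # sequential attractor generation
        h, c = decoder.step(decoder_input, h, c)
        attractors.append(h)
    return np.stack(attractors)           # shape: (num_speakers, dim)

rng = np.random.default_rng(0)
T, D, S = 50, 8, 3                        # frames, embedding size, speakers (all hypothetical)
embeddings = rng.standard_normal((T, D))
encoder, decoder = LSTMCell(D, rng), LSTMCell(D, rng)

# Baseline: decoder conditioned on zero vectors.
baseline = eda_attractors(embeddings, S, encoder, decoder, np.zeros(D))

# Extension: decoder conditioned on a summary vector.
# ASSUMPTION: mean pooling stands in for the learned summary representation.
summary = embeddings.mean(axis=0)
conditioned = eda_attractors(embeddings, S, encoder, decoder, summary)

# Frame-wise speaker activity: sigmoid of embedding-attractor dot products.
posteriors = sigmoid(embeddings @ conditioned.T)  # shape: (T, S)
```

The only difference between the two calls is the `decoder_input` argument, which mirrors the abstract's description of replacing the zero vectors with utterance-level information.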
