Title: Data Efficient Child-Adult Speaker Diarization with Simulated Conversations

Sep 13, 2024

Anfeng Xu, Tiantian Feng, Helen Tager-Flusberg, Catherine Lord, Shrikanth Narayanan

Abstract: Automating child speech analysis is crucial for applications such as neurocognitive assessments. Speaker diarization, which identifies "who spoke when", is an essential component of the automated analysis. However, publicly available child-adult speaker diarization solutions are scarce due to privacy concerns and a lack of annotated datasets, while manually annotating data for each scenario is both time-consuming and costly. To overcome these challenges, we propose a data-efficient solution by creating simulated child-adult conversations using AudioSet. We then train a Whisper Encoder-based model, achieving strong zero-shot performance on child-adult speaker diarization using real datasets. The model performance improves substantially when fine-tuned with only 30 minutes of real training data, with LoRA further improving the transfer learning performance. The source code and the child-adult speaker diarization model trained on simulated conversations are publicly available.
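As a rough illustration of what a simulated conversation might look like at the label level, the sketch below builds a toy timeline of alternating child and adult turns separated by random silences, then converts it to frame-level labels of the kind a diarization model could be trained on. It is not the authors' pipeline: the function name `simulate_labels`, the turn and gap durations, the strict alternation of speakers, and the 50 frames-per-second rate are all assumptions made for illustration, and no audio (from AudioSet or elsewhere) is mixed here.

```python
import random

# Hypothetical label ids for this sketch only.
SILENCE, CHILD, ADULT = 0, 1, 2


def simulate_labels(n_turns, frame_hz=50, min_dur=0.5, max_dur=3.0,
                    max_gap=1.0, seed=0):
    """Return (segments, frame_labels) for a toy alternating conversation.

    segments is a list of (speaker, start_sec, end_sec) tuples;
    frame_labels holds one label per frame at frame_hz frames per second.
    """
    rng = random.Random(seed)
    segments = []
    t = 0.0
    speaker = rng.choice([CHILD, ADULT])
    for _ in range(n_turns):
        t += rng.uniform(0.0, max_gap)        # silence before this turn
        dur = rng.uniform(min_dur, max_dur)   # assumed turn length in seconds
        segments.append((speaker, t, t + dur))
        t += dur
        speaker = ADULT if speaker == CHILD else CHILD  # strict alternation

    n_frames = int(round(t * frame_hz))
    labels = [SILENCE] * n_frames
    for spk, start, end in segments:
        first = int(start * frame_hz)
        last = min(n_frames, int(end * frame_hz))
        for i in range(first, last):
            labels[i] = spk
    return segments, labels


if __name__ == "__main__":
    segs, labs = simulate_labels(n_turns=6, seed=1)
    for spk, start, end in segs:
        name = "child" if spk == CHILD else "adult"
        print(f"{name}: {start:.2f}s to {end:.2f}s")
    print(len(labs), "frames")
```

In a fuller simulation, each segment would be filled with a child or adult speech clip and the frame labels would serve as training targets; how the clips are selected and mixed is left out here because the abstract does not specify it.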

* Under review
