Title: SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech

Jul 03, 2024

Jingru Lin, Meng Ge, Junyi Ao, Liqun Deng, Haizhou Li

Abstract: Pre-trained models built with self-supervised learning (SSL) techniques have been shown to be effective in various downstream speech tasks. However, most such models are trained on single-speaker speech data, which limits their effectiveness on mixture speech. This motivates us to explore pre-training on mixture speech. This work presents SA-WavLM, a novel pre-trained model for mixture speech. Specifically, SA-WavLM follows an "extract-merge-predict" pipeline in which the representations of each speaker in the input mixture are first extracted individually and then merged before the final prediction. In this pipeline, SA-WavLM performs speaker-informed extraction that takes into account the interactions between different speakers. Furthermore, a speaker shuffling strategy is proposed to enhance robustness to speaker absence. Experiments show that SA-WavLM either matches or improves upon state-of-the-art pre-trained models.
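To make the data flow of the "extract-merge-predict" pipeline concrete, here is a minimal toy sketch in plain Python. It is not the SA-WavLM architecture: the function names (`extract`, `merge`, `predict`, `shuffle_speakers`, `forward`), the gating-by-embedding extraction, the sum-based merge, and the argmax prediction are all hypothetical stand-ins chosen for illustration. The abstract does not say how speaker shuffling is carried out, so the random permutation of speaker embeddings below is an assumption, and the interactions between speakers are not modeled at all.

```python
import random
from typing import List, Optional, Sequence

Frame = List[float]


def extract(mixture: Sequence[Frame], speaker_emb: Sequence[float]) -> List[Frame]:
    """Toy speaker-informed extraction (assumption): gate every feature
    dimension of each mixture frame by the speaker embedding."""
    return [[x * g for x, g in zip(frame, speaker_emb)] for frame in mixture]


def merge(per_speaker: Sequence[List[Frame]]) -> List[Frame]:
    """Toy merge (assumption): element-wise sum of the per-speaker
    representations, frame by frame."""
    return [[sum(vals) for vals in zip(*frames)] for frames in zip(*per_speaker)]


def predict(merged: Sequence[Frame]) -> List[int]:
    """Toy prediction (assumption): index of the largest feature per frame,
    standing in for a pseudo-label prediction head."""
    return [max(range(len(f)), key=f.__getitem__) for f in merged]


def shuffle_speakers(embs: Sequence[Sequence[float]], rng: random.Random):
    """Hypothetical speaker shuffling: return the embeddings in random order."""
    shuffled = list(embs)
    rng.shuffle(shuffled)
    return shuffled


def forward(
    mixture: Sequence[Frame],
    speaker_embs: Sequence[Sequence[float]],
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Extract per speaker, merge, then predict. Shuffle only if rng is given."""
    if rng is not None:
        speaker_embs = shuffle_speakers(speaker_embs, rng)
    per_speaker = [extract(mixture, emb) for emb in speaker_embs]
    return predict(merge(per_speaker))


# Two frames with two feature dimensions, and two one-hot "speaker embeddings".
mixture = [[1.0, 2.0], [3.0, 1.0]]
speaker_embs = [[1.0, 0.0], [0.0, 1.0]]
labels = forward(mixture, speaker_embs)
labels_shuffled = forward(mixture, speaker_embs, rng=random.Random(0))
```

With these toy inputs, `labels` is `[1, 0]`. Because the merge here is a plain sum, the shuffled call returns the same labels; a real model would use learned, order-sensitive components, which is where a shuffling strategy would actually matter.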

* InterSpeech 2024
