Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:Speaker activity driven neural speech extraction

Feb 09, 2021

Marc Delcroix, Katerina Zmolikova, Tsubasa Ochiai, Keisuke Kinoshita, Tomohiro Nakatani

Figure 1 for Speaker activity driven neural speech extraction

Figure 2 for Speaker activity driven neural speech extraction

Figure 3 for Speaker activity driven neural speech extraction

Share this with someone who'll enjoy it:

Abstract:Target speech extraction, which extracts the speech of a target speaker in a mixture given auxiliary speaker clues, has recently received increased interest. Various clues have been investigated such as pre-recorded enrollment utterances, direction information, or video of the target speaker. In this paper, we explore the use of speaker activity information as an auxiliary clue for single-channel neural network-based speech extraction. We propose a speaker activity driven speech extraction neural network (ADEnet) and show that it can achieve performance levels competitive with enrollment-based approaches, without the need for pre-recordings. We further demonstrate the potential of the proposed approach for processing meeting-like recordings, where the speaker activity is obtained from a diarization system. We show that this simple yet practical approach can successfully extract speakers after diarization, which results in improved ASR performance, especially in high overlapping conditions, with a relative word error rate reduction of up to 25%.

* To appear in ICASSP 2021

View paper on

Share this with someone who'll enjoy it:

Title:Speaker activity driven neural speech extraction

Paper and Code