Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Saurav Pahuja

NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention

Sep 04, 2024

Dashanka De Silva, Siqi Cai, Saurav Pahuja, Tanja Schultz, Haizhou Li

Figure 1 for NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention

Figure 2 for NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention

Figure 3 for NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention

Figure 4 for NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention

Abstract:In the study of auditory attention, it has been revealed that there exists a robust correlation between attended speech and elicited neural responses, measurable through electroencephalography (EEG). Therefore, it is possible to use the attention information available within EEG signals to guide the extraction of the target speaker in a cocktail party computationally. In this paper, we present a neuro-guided speaker extraction model, i.e. NeuroSpex, using the EEG response of the listener as the sole auxiliary reference cue to extract attended speech from monaural speech mixtures. We propose a novel EEG signal encoder that captures the attention information. Additionally, we propose a cross-attention (CA) mechanism to enhance the speech feature representations, generating a speaker extraction mask. Experimental results on a publicly available dataset demonstrate that our proposed model outperforms two baseline models across various evaluation metrics.

Via

Access Paper or Ask Questions