Existing deep learning (DL) based speech enhancement approaches are generally optimised to minimise the distance between clean and enhanced speech features. These approaches often improve speech quality; however, they generalise poorly and may not deliver the required speech intelligibility in real noisy situations. To address these challenges, researchers have explored intelligibility-oriented (I-O) loss functions and the integration of audio-visual (AV) information for more robust speech enhancement (SE). In this paper, we introduce DL-based I-O SE algorithms exploiting AV information, a previously unexplored research direction. Specifically, we present a fully convolutional AV SE model that uses a modified short-time objective intelligibility (STOI) metric as its training cost function. To the best of our knowledge, this is the first work to exploit the integration of AV modalities with an I-O loss function for SE. Comparative experimental results demonstrate that, in terms of standard objective evaluation measures, our proposed I-O AV SE framework outperforms audio-only (AO) and AV models trained with conventional distance-based loss functions when dealing with unseen speakers and noises.
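To make the idea of an I-O training objective concrete, the following is a minimal sketch of a STOI-style loss in PyTorch. It is illustrative only, not the authors' exact modified STOI: it assumes the inputs are already non-negative band envelopes (the full metric also involves resampling, one-third-octave band analysis, silent-frame removal, and clipped normalisation, all omitted here), and the function name and parameters are hypothetical.

```python
import torch

def stoi_like_loss(clean_env, enhanced_env, seg_len=30, eps=1e-8):
    """Negative STOI-style correlation as a training loss (sketch).

    clean_env, enhanced_env: (batch, bands, frames) non-negative
    time-frequency band envelopes of clean and enhanced speech.
    seg_len: frames per short-time segment (STOI uses ~384 ms).
    """
    # Split each band envelope into overlapping short-time segments.
    x = clean_env.unfold(-1, seg_len, 1)     # (batch, bands, n_seg, seg_len)
    y = enhanced_env.unfold(-1, seg_len, 1)

    # Zero-mean each segment, then take the linear correlation between
    # clean and enhanced envelopes per segment and band.
    x = x - x.mean(dim=-1, keepdim=True)
    y = y - y.mean(dim=-1, keepdim=True)
    corr = (x * y).sum(dim=-1) / (x.norm(dim=-1) * y.norm(dim=-1) + eps)

    # STOI averages these intermediate correlations; negate so that
    # maximising estimated intelligibility minimises the loss.
    return -corr.mean()
```

Because every step is differentiable, such a loss can replace a conventional distance-based objective (e.g. mean squared error on spectral features) and be backpropagated through the enhancement network directly, which is the essential design choice behind I-O training.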