Title: Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features

Dec 16, 2021

Yicheng Hsu, Yonghan Lee, Mingsian R. Bai

Abstract: Teleconferencing has become essential during the COVID-19 pandemic. In real-world use, however, speech quality can be degraded by background interference, noise, or reverberation. To address this problem, the target speech can be extracted from the mixture signals with the aid of the user's vocal features. The proposed system uses several features, including speaker embeddings derived from user enrollment and a novel long-short-term spatial coherence (LSTSC) feature pertaining to the target speaker's activity. As a learning-based approach, a target speech sifting network is employed to extract the target speech signal. The network trained with LSTSC is robust to the microphone array geometry and the number of microphones. The proposed enhancement system was compared with a baseline system that uses speaker embeddings and interchannel phase difference (IPD). The results show that the proposed system outperforms the baseline in both enhancement performance and robustness.
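
For readers unfamiliar with the baseline's spatial feature, interchannel phase difference is commonly computed as the phase of one microphone's short-time spectrum multiplied by the complex conjugate of another's. The sketch below illustrates that textbook definition only. It is not the authors' implementation, and the function names, frame length, and hop size are assumptions chosen for illustration. The LSTSC feature is not sketched because the abstract does not define it.

```python
import numpy as np


def stft(x, frame_len=512, hop=256):
    """Naive STFT: Hann-windowed frames, one-sided FFT. Returns (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def interchannel_phase_difference(x_ref, x_other, frame_len=512, hop=256):
    """Phase of x_ref relative to x_other per time-frequency bin, in (-pi, pi]."""
    X_ref = stft(x_ref, frame_len, hop)
    X_other = stft(x_other, frame_len, hop)
    # Multiplying by the conjugate subtracts the phases and wraps the result.
    return np.angle(X_ref * np.conj(X_other))


if __name__ == "__main__":
    fs = 16000  # assumed sampling rate in Hz
    t = np.arange(fs) / fs
    # Two hypothetical microphone signals: the second lags the first by 0.5 rad.
    mic_a = np.cos(2 * np.pi * 1000 * t)
    mic_b = np.cos(2 * np.pi * 1000 * t - 0.5)
    ipd = interchannel_phase_difference(mic_a, mic_b)
    print(ipd.shape)
```

Because this feature depends on the spacing between a specific microphone pair, a model trained on it is tied to the array it was trained with, which is the kind of dependence the abstract says the LSTSC-trained network avoids.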

* submitted to ICASSP 2022
