Title: DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement

Dec 15, 2022

Dongheon Lee, Jung-Woo Choi

Abstract: In this study, we propose a dense frequency-time attentive network (DeFT-AN) for multichannel speech enhancement. DeFT-AN is a mask estimation network that predicts a complex spectral masking pattern for suppressing the noise and reverberation embedded in the short-time Fourier transform (STFT) of an input signal. The proposed mask estimation network incorporates three different types of blocks for aggregating information in the spatial, spectral, and temporal dimensions. It utilizes a spectral transformer with a modified feed-forward network and a temporal conformer with sequential dilated convolutions. The use of dense blocks and transformers dedicated to the three different characteristics of audio signals enables more comprehensive denoising and dereverberation. Experiments on two popular noisy and reverberant datasets show that DeFT-AN outperforms state-of-the-art multichannel models across various metrics for speech quality and intelligibility.

* 5 pages, 5 figures, 3 tables
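
As a rough illustration of the complex spectral masking mentioned in the abstract, the sketch below multiplies a complex STFT element-wise by a complex-valued mask, which adjusts magnitude and phase together. It is not the authors' implementation: the function name `apply_complex_mask`, the toy array shapes, and the use of an oracle mask in place of a network output are all assumptions made for the example.

```python
import numpy as np

def apply_complex_mask(noisy_stft, mask_real, mask_imag):
    """Apply a complex-valued mask to a complex STFT, element by element.

    Hypothetical helper: a mask estimation network would supply
    mask_real and mask_imag; here they are passed in directly.
    """
    mask = mask_real + 1j * mask_imag
    return mask * noisy_stft

# Toy data: 3 frequency bins x 4 time frames for a single (assumed) reference channel.
rng = np.random.default_rng(0)
clean = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
noise = 0.5 * (rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4)))
noisy = clean + noise

# Oracle mask computed from the known clean signal; it only stands in
# for what a trained network would have to estimate from the noisy input.
oracle_mask = clean / noisy
enhanced = apply_complex_mask(noisy, oracle_mask.real, oracle_mask.imag)

# With the oracle mask, the masked STFT matches the clean STFT.
print(np.allclose(enhanced, clean))
```

A real-valued magnitude mask would leave the noisy phase untouched. The complex product can rotate the phase of each time-frequency bin as well as scale it.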
