Abstract: In many multi-microphone algorithms for noise reduction, an estimate of the relative transfer function (RTF) vector of the target speaker is required. The state-of-the-art covariance whitening (CW) method estimates the RTF vector as the principal eigenvector of the whitened noisy covariance matrix, where whitening is performed using an estimate of the noise covariance matrix. In this paper, we consider an acoustic sensor network consisting of multiple microphone nodes. Assuming uncorrelated noise between the nodes but not within the nodes, we propose two RTF vector estimation methods that leverage the block-diagonal structure of the noise covariance matrix. The first method modifies the CW method by considering only the diagonal blocks of the estimated noise covariance matrix. In contrast, the second method only considers the off-diagonal blocks of the noisy covariance matrix, but cannot be solved using a simple eigenvalue decomposition. When applying the estimated RTF vector in a minimum variance distortionless response beamformer, simulation results for real-world recordings in a reverberant environment with multiple noise sources show that the modified CW method performs slightly better than the CW method in terms of SNR improvement, while the off-diagonal selection method outperforms a biased RTF vector estimate obtained as the principal eigenvector of the noisy covariance matrix.
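The covariance whitening step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the Cholesky factor as the choice of square root, the de-whitening and reference-microphone normalization, and the `node_sizes` argument are assumptions made for illustration. The helper `keep_diagonal_blocks` shows one plausible way to retain only the diagonal blocks of an estimated noise covariance matrix, as in the modified CW method.

```python
import numpy as np

def rtf_covariance_whitening(R_y, R_n, ref=0):
    """Sketch of a CW-style RTF estimate for one frequency bin.

    R_y: noisy covariance matrix (M x M, Hermitian).
    R_n: noise covariance matrix (M x M, Hermitian positive definite).
    ref: index of the reference microphone (assumed convention).
    """
    # Square-root factor of the noise covariance: R_n = L L^H
    L = np.linalg.cholesky(R_n)
    # Whitened noisy covariance: L^{-1} R_y L^{-H} (uses R_y = R_y^H)
    tmp = np.linalg.solve(L, R_y)
    R_w = np.linalg.solve(L, tmp.conj().T)
    R_w = 0.5 * (R_w + R_w.conj().T)  # enforce Hermitian symmetry numerically
    # Principal eigenvector (eigh returns eigenvalues in ascending order)
    _, eigvecs = np.linalg.eigh(R_w)
    v = eigvecs[:, -1]
    # De-whiten and normalize with respect to the reference microphone
    h = L @ v
    return h / h[ref]

def keep_diagonal_blocks(R, node_sizes):
    """Zero all blocks of R that couple different nodes (hypothetical helper)."""
    R_bd = np.zeros_like(R)
    start = 0
    for size in node_sizes:
        stop = start + size
        R_bd[start:stop, start:stop] = R[start:stop, start:stop]
        start = stop
    return R_bd
```

Under these assumptions, a modified CW estimate would be obtained as `rtf_covariance_whitening(R_y, keep_diagonal_blocks(R_n_hat, node_sizes))`, where `R_n_hat` stands for an estimated noise covariance matrix.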
Abstract: There is an emerging need for comparable data for multi-microphone processing, particularly in acoustic sensor networks. However, commonly available databases are often limited in the spatial diversity of the microphones or only allow for particular signal processing tasks. In this paper, we present a database of acoustic impulse responses and recordings for a binaural hearing aid setup, 36 spatially distributed microphones spanning a uniform grid of $(5 \times 5)\,\mathrm{m}^2$ and 12 source positions. This database can be used for a variety of signal processing tasks, such as (multi-microphone) noise reduction, source localization, and dereverberation, as the measurements were performed using the same setup for three different reverberation conditions ($T_{60} \approx \{310, 510, 1300\}$ ms). The usability of the database is demonstrated for a noise reduction task using a minimum variance distortionless response beamformer based on relative transfer functions, exploiting the availability of spatially distributed microphones.
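For readers unfamiliar with the beamformer mentioned above, the following is a minimal sketch of the textbook RTF-based MVDR filter for a single frequency bin, $w = R_n^{-1} h / (h^H R_n^{-1} h)$. The function names and the array layout (microphones along the first axis) are assumptions for illustration and are not taken from the database or its accompanying code.

```python
import numpy as np

def mvdr_weights(R_n, h):
    """Textbook MVDR filter for one frequency bin.

    R_n: noise covariance matrix (M x M, Hermitian positive definite).
    h:   RTF vector of the target (length M, equal to 1 at the reference mic).
    """
    Rinv_h = np.linalg.solve(R_n, h)        # R_n^{-1} h
    return Rinv_h / np.vdot(h, Rinv_h)      # divide by h^H R_n^{-1} h

def apply_beamformer(w, Y):
    """Apply w^H to multi-microphone STFT frames Y of shape (M, T)."""
    return w.conj() @ Y
```

By construction, the filter satisfies the distortionless constraint $w^H h = 1$, so a target component of the form `h[:, None] * s[None, :]` passes through as `s`.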
Abstract: In many multi-microphone algorithms, an estimate of the relative transfer functions (RTFs) of the desired speaker is required. Recently, a computationally efficient RTF vector estimation method was proposed for acoustic sensor networks, assuming that the spatial coherence (SC) of the noise component between a local microphone array and multiple external microphones is low. Aiming at optimizing the output signal-to-noise ratio (SNR), this method linearly combines multiple RTF vector estimates, where the complex-valued weights are computed using a generalized eigenvalue decomposition (GEVD). In this paper, we perform a theoretical bias analysis for the SC-based RTF vector estimation method with multiple external microphones. Assuming a certain model for the noise field, we derive an analytical expression for the weights, showing that the optimal model-based weights are real-valued and only depend on the input SNR in the external microphones. Simulations with real-world recordings show good agreement between the GEVD-based and the model-based weights. Nevertheless, the results also indicate that in practice, estimation errors occur that the model-based weights cannot account for.
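The idea behind exploiting low spatial coherence can be sketched for the simplest case of a single external microphone. If the noise is uncorrelated between the local array and the external microphone, the cross-covariance between the local signals and the external signal contains only the speech component, so normalizing it by its reference entry yields the local RTF vector. The sketch below illustrates this principle only; the function name and indexing convention are assumptions, and it does not implement the GEVD-based or model-based combination of multiple estimates discussed in the abstract.

```python
import numpy as np

def rtf_from_external_mic(R_y, local_idx, ext_idx, ref=0):
    """Sketch of an SC-style RTF estimate using one external microphone.

    R_y:       noisy covariance matrix over all microphones (Hermitian).
    local_idx: list of indices of the local array microphones.
    ext_idx:   index of the external microphone.
    ref:       position of the reference microphone within local_idx.
    """
    # Cross-covariance between the local microphones and the external one
    cross = R_y[local_idx, ext_idx]
    # Normalize so that the reference entry equals 1
    return cross / cross[ref]
```

With estimated covariance matrices and non-zero residual coherence, such an estimate is generally biased, which is the kind of effect the bias analysis addresses.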