Abstract: To estimate the direction of arrival (DOA) of multiple speakers with methods that use prototype transfer functions, frequency-dependent spatial spectra (SPS) are usually constructed. To make the DOA estimation robust, SPS from different frequencies can be combined. Depending on how the SPS are combined, frequency fusion mechanisms are categorized as narrowband, broadband, or speaker-grouped, where the last mechanism requires a speaker-wise grouping of frequencies. In this paper, we propose an interaural time difference (ITD)-based speaker-grouped frequency fusion mechanism for a binaural hearing aid setup. By exploiting the DOA dependence of ITDs, frequencies can be grouped according to a common ITD and used for DOA estimation of the respective speaker. We apply the proposed ITD-based speaker-grouped frequency fusion mechanism to different DOA estimation methods, namely multiple signal classification, steered response power, and a recently published method based on relative transfer function (RTF) vectors. In our experiments, we compare DOA estimation with different fusion mechanisms. For all considered DOA estimation methods, the proposed ITD-based speaker-grouped frequency fusion mechanism results in a higher DOA estimation accuracy than the narrowband and broadband fusion mechanisms.
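The three fusion mechanisms named in the abstract above can be illustrated with a minimal sketch. It assumes that a spatial spectrum has already been computed for every frequency and that a per-frequency speaker label is available. How such labels would be derived from ITDs is not described in the abstract, so the `labels` array is a hypothetical input, and all function names are illustrative rather than taken from the paper.

```python
import numpy as np

def narrowband_fusion(sps):
    """One DOA index per frequency: the peak of each frequency's spectrum.

    sps: array of shape (F, D), F frequencies and D candidate directions.
    """
    return np.argmax(sps, axis=1)

def broadband_fusion(sps):
    """Sum the spectra over all frequencies into one spectrum of shape (D,)."""
    return sps.sum(axis=0)

def speaker_grouped_fusion(sps, labels, num_speakers):
    """Sum the spectra within each speaker's frequency group and return
    the peak direction index of each grouped spectrum.

    labels: integer array of shape (F,), hypothetical speaker index per frequency.
    """
    doas = []
    for s in range(num_speakers):
        grouped = sps[labels == s].sum(axis=0)  # all-zero if the group is empty
        doas.append(int(np.argmax(grouped)))
    return doas

# Toy data (assumed): 6 frequencies, 8 candidate directions.
# Frequencies 0-2 peak at direction 2, frequencies 3-5 at direction 5.
sps = np.zeros((6, 8))
sps[:3, 2] = 1.0
sps[3:, 5] = 1.0
labels = np.array([0, 0, 0, 1, 1, 1])

per_frequency_doas = narrowband_fusion(sps)
broadband_spectrum = broadband_fusion(sps)
grouped_doas = speaker_grouped_fusion(sps, labels, num_speakers=2)
```

In this toy case the broadband spectrum contains both peaks in a single curve, whereas the grouped variant yields one spectrum, and hence one DOA estimate, per speaker.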
Abstract: In hearing aid applications, an important objective is to accurately estimate the direction of arrival (DOA) of multiple speakers in noisy and reverberant environments. Recently, we proposed a binaural DOA estimation method, where the DOAs of the speakers are estimated by selecting the directions for which the so-called Hermitian angle spectrum between the estimated relative transfer function (RTF) vector and a database of prototype anechoic RTF vectors is maximized. The RTF vector is estimated using the covariance whitening (CW) method, which requires a computationally complex generalized eigenvalue decomposition. The spatial spectrum is obtained by only considering frequencies where it is likely that one speaker dominates over the other speakers, noise, and reverberation. In this contribution, we exploit the availability of an external microphone that is spatially separated from the hearing aid microphones and consider a low-complexity RTF vector estimation method that assumes a low spatial coherence between the undesired components in the external microphone and the hearing aid microphones. Using recordings of two speakers and diffuse-like babble noise in acoustic environments with mild reverberation and low signal-to-noise ratio, simulation results show that the proposed method yields a DOA estimation performance comparable to that of the CW method at a lower computational complexity.
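One way to read the low-complexity estimator described above: if the undesired components in the external microphone are uncorrelated with those in the hearing aid microphones, the cross-correlation between each hearing aid microphone and the external microphone contains only the speaker component, and dividing by the entry of the reference microphone gives the RTF vector. The sketch below follows that reading for a single frequency bin. It assumes the external microphone is the last channel of the covariance matrix; the function name and the toy data are illustrative and not taken from the paper.

```python
import numpy as np

def rtf_spatial_coherence(R_y, ref_idx=0):
    """Estimate the head-mounted RTF vector from the noisy covariance matrix.

    R_y: (M+1, M+1) covariance E[y y^H] for one frequency bin, where the
         last channel is assumed to be the external microphone.
    Returns the length-M RTF vector relative to microphone ref_idx.
    """
    cross = R_y[:-1, -1]          # E[y_m * conj(y_ext)] for head-mounted mics
    return cross / cross[ref_idx]

# Toy data (assumed): 4 head-mounted microphones plus 1 external microphone.
rng = np.random.default_rng(0)
h = np.array([1.0, 0.5 + 0.2j, -0.3j, 0.8])   # true RTF, reference mic 0
a = np.append(h, 0.6 - 0.4j)                  # transfer functions incl. external mic
R_x = 2.0 * np.outer(a, a.conj())             # rank-1 speaker covariance

N = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
R_n = np.zeros((5, 5), dtype=complex)
R_n[:4, :4] = N @ N.conj().T                  # correlated noise among head-mounted mics
R_n[4, 4] = 0.7                               # external-mic noise, uncorrelated with the rest

rtf_est = rtf_spatial_coherence(R_x + R_n)
```

No matrix decomposition is involved, which is consistent with the lower complexity claimed relative to the CW method. The estimate is only exact when the uncorrelatedness assumption holds, as it does by construction in the toy data.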
Abstract: There is an emerging need for comparable data for multi-microphone processing, particularly in acoustic sensor networks. However, commonly available databases are often limited in the spatial diversity of the microphones or only allow for particular signal processing tasks. In this paper, we present a database of acoustic impulse responses and recordings for a binaural hearing aid setup, 36 spatially distributed microphones spanning a uniform grid of (5 × 5) m², and 12 source positions. This database can be used for a variety of signal processing tasks, such as (multi-microphone) noise reduction, source localization, and dereverberation, as the measurements were performed using the same setup for three different reverberation conditions (T_60 ≈ {310, 510, 1300} ms). The usability of the database is demonstrated for a noise reduction task using a minimum variance distortionless response beamformer based on relative transfer functions, exploiting the availability of spatially distributed microphones.
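For reference, the textbook form of a minimum variance distortionless response (MVDR) beamformer driven by an RTF vector can be sketched as follows for a single frequency bin. This is the standard closed-form solution, not necessarily the exact configuration used with the database; the function name and the toy data are assumptions.

```python
import numpy as np

def mvdr_weights(R_n, h):
    """MVDR filter w = R_n^{-1} h / (h^H R_n^{-1} h).

    R_n: (M, M) noise covariance matrix for one frequency bin.
    h:   (M,) RTF vector of the target, equal to 1 at the reference mic.
    """
    x = np.linalg.solve(R_n, h)   # R_n^{-1} h without forming the inverse
    return x / np.vdot(h, x)      # np.vdot conjugates its first argument

# Toy data (assumed): 4 microphones, random Hermitian positive definite noise.
rng = np.random.default_rng(1)
N = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
R_n = N @ N.conj().T + 0.1 * np.eye(4)
h = np.array([1.0, 0.7 - 0.1j, 0.2 + 0.5j, -0.4j])

w = mvdr_weights(R_n, h)
target_response = np.vdot(w, h)   # w^H h, the distortionless constraint
```

The output for a microphone-signal vector `y` would be `np.vdot(w, y)`. The constraint `w^H h = 1` keeps the target component at the reference microphone undistorted while the noise power is minimized.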
Abstract: Recently, a relative transfer function (RTF)-vector-based method has been proposed to estimate the direction of arrival (DOA) of a target speaker for a binaural hearing aid setup, assuming the availability of external microphones. This method exploits the external microphones to estimate the RTF vector corresponding to the binaural hearing aid and constructs a one-dimensional spatial spectrum by comparing the estimated RTF vector against a database of anechoic prototype RTF vectors for several directions. In this paper, we assume the availability of a calibrated array of external microphones, which is characterized by a second database of anechoic prototype RTF vectors. We propose a method in which the external microphones are not only exploited to estimate the RTF vector corresponding to the binaural hearing aid but also assist in estimating the DOA of the target speaker. Based on the estimated RTF vector for all microphones and both prototype databases, a two-dimensional spatial spectrum is constructed, from which the DOA is estimated. Experimental results for a reverberant environment with diffuse-like noise show that assisted DOA estimation outperforms DOA estimation where the prototype database characterizing the array of external microphones is not used.
Abstract: Recently, a method has been proposed to estimate the direction of arrival (DOA) of a single speaker by minimizing the frequency-averaged Hermitian angle between an estimated relative transfer function (RTF) vector and a database of prototype anechoic RTF vectors. In this paper, we extend this method to multi-speaker localization by introducing the frequency-averaged Hermitian angle spectrum and selecting peaks of this spatial spectrum. To construct the Hermitian angle spectrum, we consider only a subset of frequencies, where it is likely that one speaker is dominant. We compare the effectiveness of the generalized magnitude squared coherence and two coherent-to-diffuse ratio (CDR) estimators as frequency selection criteria. Simulation results for estimating the DOAs of two speakers in a reverberant environment with diffuse-like babble noise using binaural hearing devices show that using the binaural effective-coherence-based CDR estimate as a frequency selection criterion yields the best performance.
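A minimal sketch of a frequency-averaged Hermitian angle spectrum with peak selection is given below. It uses the standard definition of the Hermitian angle, arccos(|a^H b| / (‖a‖ ‖b‖)), and defines the spectrum as the negative mean angle so that speakers correspond to maxima; that sign convention is an assumption. The boolean mask `selected` stands in for the outcome of a frequency selection criterion such as a thresholded CDR estimate, which is not implemented here, and all names are illustrative.

```python
import numpy as np

def hermitian_angle(a, b):
    """Hermitian angle in radians between complex vectors along the last axis."""
    num = np.abs(np.sum(np.conj(a) * b, axis=-1))
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return np.arccos(np.clip(num / den, 0.0, 1.0))

def hermitian_angle_spectrum(rtf_est, prototypes, selected):
    """Negative frequency-averaged Hermitian angle per candidate direction.

    rtf_est:    (F, M) estimated RTF vectors, F frequencies, M microphones.
    prototypes: (D, F, M) prototype RTF vectors for D candidate directions.
    selected:   (F,) boolean mask of frequencies to include (hypothetical input).
    """
    angles = hermitian_angle(prototypes[:, selected, :], rtf_est[None, selected, :])
    return -angles.mean(axis=1)

def pick_peaks(spectrum, num_speakers):
    """Indices of the largest local maxima, treating the direction axis as circular."""
    is_peak = (spectrum > np.roll(spectrum, 1)) & (spectrum >= np.roll(spectrum, -1))
    candidates = np.flatnonzero(is_peak)
    ranked = candidates[np.argsort(spectrum[candidates])[::-1]]
    return ranked[:num_speakers]

# Toy data (assumed): 12 directions, 8 frequencies, 4 microphones.
rng = np.random.default_rng(2)
prototypes = rng.standard_normal((12, 8, 4)) + 1j * rng.standard_normal((12, 8, 4))
rtf_est = prototypes[3].copy()               # "estimate" equals prototype direction 3
selected = np.array([True] * 6 + [False] * 2)

spectrum = hermitian_angle_spectrum(rtf_est, prototypes, selected)
peaks = pick_peaks(spectrum, num_speakers=2)
```

Because the Hermitian angle is invariant to a complex scaling of either vector, the comparison does not depend on how the RTF vectors are normalized.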
Abstract: In this paper, we consider a binaural hearing aid setup, where in addition to the head-mounted microphones an external microphone is available. For this setup, we investigate the performance of several relative transfer function (RTF) vector estimation methods to estimate the direction of arrival (DOA) of the target speaker in a noisy and reverberant acoustic environment. In particular, we consider the state-of-the-art covariance whitening (CW) and covariance subtraction (CS) methods, either incorporating the external microphone or not, and the recently proposed spatial coherence (SC) method, which requires the external microphone. To estimate the DOA from the estimated RTF vector, we propose to minimize the frequency-averaged Hermitian angle between the estimated head-mounted RTF vector and a database of prototype head-mounted RTF vectors. Experimental results with stationary and moving speech sources in a reverberant environment with diffuse-like noise show that the SC method outperforms the CS method and yields a DOA estimation accuracy similar to that of the CW method at a lower computational complexity.
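The CW and CS estimators compared in this abstract are commonly formulated as below for a single frequency bin. This sketch follows the textbook formulations, assuming that estimates of the noisy covariance `R_y` and the noise covariance `R_n` are available; it is not claimed to match the paper's implementation in detail, and the function names and toy data are illustrative.

```python
import numpy as np

def rtf_covariance_subtraction(R_y, R_n, ref_idx=0):
    """CS: take the reference column of R_y - R_n and normalize it."""
    R_x = R_y - R_n
    return R_x[:, ref_idx] / R_x[ref_idx, ref_idx]

def rtf_covariance_whitening(R_y, R_n, ref_idx=0):
    """CW: whiten with the Cholesky factor of R_n, take the principal
    eigenvector of the whitened covariance, de-whiten, and normalize."""
    L = np.linalg.cholesky(R_n)               # R_n = L L^H
    L_inv = np.linalg.inv(L)
    R_w = L_inv @ R_y @ L_inv.conj().T        # whitened noisy covariance
    R_w = 0.5 * (R_w + R_w.conj().T)          # remove numerical asymmetry
    _, vecs = np.linalg.eigh(R_w)             # eigenvalues in ascending order
    a = L @ vecs[:, -1]                       # de-whitened principal eigenvector
    return a / a[ref_idx]

# Toy data (assumed): rank-1 speaker covariance plus random noise covariance.
rng = np.random.default_rng(3)
h = np.array([1.0, 0.5 + 0.2j, -0.3j, 0.8])   # true RTF, reference mic 0
N = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
R_n = N @ N.conj().T + 0.1 * np.eye(4)
R_y = 2.0 * np.outer(h, h.conj()) + R_n

rtf_cs = rtf_covariance_subtraction(R_y, R_n)
rtf_cw = rtf_covariance_whitening(R_y, R_n)
```

With exact covariance matrices both estimators recover the true RTF vector. The whitening and eigenvalue decomposition in CW are the steps that make it more expensive than CS.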