Abstract: In this work, we introduce a spatio-temporal kernel for Gaussian process (GP) regression-based sound field estimation. Notably, GPs have the attractive property that the estimated sound field is a linear function of the measurements, allowing the field to be estimated efficiently from distributed microphone measurements. However, to ensure analytical tractability, most existing kernels for sound field estimation have been formulated in the frequency domain, formed independently for each frequency. To address the analytical intractability of spatio-temporal kernels, we propose instead to learn the kernel directly from data by means of deep kernel learning. Furthermore, to improve the generalization of the deep kernel, we propose a method for regularizing the learning process using the wave equation. The representational advantages of the deep kernel and the improved generalization obtained by using the wave equation regularization are illustrated using numerical simulations.
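
The linearity property mentioned above can be illustrated with a minimal sketch. For a zero-mean GP, the posterior mean at the evaluation points is a fixed weight matrix applied to the measurement vector, so the weights can be computed once and reused. The squared-exponential kernel, the function names, the noise variance, and the random microphone positions below are all assumptions made for illustration; they are not the deep kernel or the setup proposed in this work.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    # Squared-exponential kernel; a generic stand-in, NOT the kernel of this work.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_weights(x_mic, x_eval, noise_var=1e-3):
    # Posterior mean of a zero-mean GP: K_star @ inv(K + noise_var * I) @ y.
    # The matrix returned here does not depend on the measurements y.
    K = rbf_kernel(x_mic, x_mic) + noise_var * np.eye(len(x_mic))
    K_star = rbf_kernel(x_eval, x_mic)
    # K is symmetric, so solving K @ X = K_star.T gives X.T = K_star @ inv(K).
    return np.linalg.solve(K, K_star.T).T

rng = np.random.default_rng(0)
x_mic = rng.uniform(-1.0, 1.0, size=(8, 3))   # hypothetical microphone positions
x_eval = rng.uniform(-1.0, 1.0, size=(5, 3))  # hypothetical evaluation positions

W = gp_weights(x_mic, x_eval)

# Two hypothetical measurement vectors (one value per microphone).
y1 = rng.standard_normal(8)
y2 = rng.standard_normal(8)

# Linearity: the estimate of a sum equals the sum of the estimates.
est_sum = W @ (y1 + y2)
est_separate = W @ y1 + W @ y2
```

Because `W` depends only on the positions and the kernel, estimating the field for new measurements reduces to a single matrix-vector product.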
Abstract: The ability to accurately estimate room impulse responses (RIRs) is integral to many applications of spatial audio processing. Regrettably, estimating the RIR using ambient signals, such as speech or music, remains a challenging problem due to, e.g., low signal-to-noise ratios, finite sample lengths, and poor spectral excitation. Commonly, in order to improve the conditioning of the estimation problem, priors are placed on the amplitudes of the RIR. Although it serves as a regularizer, this type of prior is generally not useful when only approximate knowledge of the delay structure is available, which, for example, is the case when the prior is a simulated RIR from an approximation of the room geometry. In this work, we target the delay structure itself, constructing a prior based on the concept of optimal transport. As illustrated using both simulated and measured data, the resulting method is able to beneficially incorporate information even from simple simulation models, displaying considerable robustness to perturbations in the assumed room dimensions and temperature.
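
To illustrate why a transport-based measure suits delay structure, the sketch below compares two single-reflection responses with a one-dimensional Wasserstein-1 distance and with a plain Euclidean distance on the amplitudes. It assumes nonnegative, normalized responses on a common uniform time grid (for example, energy envelopes), for which the Wasserstein-1 distance equals the summed absolute difference of the cumulative sums times the grid spacing. The function names and the toy signals are hypothetical, and this is not the prior construction used in this work.

```python
import numpy as np

def wasserstein1_1d(p, q, dt):
    # 1-D Wasserstein-1 distance between nonnegative sequences on a shared
    # uniform grid: normalize to unit mass, then integrate |CDF_p - CDF_q|.
    p = p / p.sum()
    q = q / q.sum()
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum() * dt

def spike(n, idx):
    # Toy response with a single unit peak at sample idx (illustrative only).
    h = np.zeros(n)
    h[idx] = 1.0
    return h

n, dt = 100, 1.0
ref = spike(n, 20)    # reference delay
near = spike(n, 22)   # small delay error
far = spike(n, 60)    # large delay error

# The transport cost grows with the size of the delay error.
w_near = wasserstein1_1d(ref, near, dt)
w_far = wasserstein1_1d(ref, far, dt)

# The Euclidean distance on amplitudes is the same for both errors.
l2_near = np.linalg.norm(ref - near)
l2_far = np.linalg.norm(ref - far)
```

In this toy case the amplitude-based distance cannot tell a small delay error from a large one, whereas the transport cost scales with the shift.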