Filip Elvander

Boundary-Informed Sound Field Reconstruction

Jun 16, 2025

David Sundström, Filip Elvander, Andreas Jakobsson

Abstract: We consider the problem of reconstructing the sound field in a room using prior information on the boundary geometry, represented as a point cloud. In general, when no boundary information is available, an accurate sound field reconstruction over a large spatial region and at high frequencies requires numerous microphone measurements. On the other hand, if all geometrical and acoustical aspects of the boundaries are known, the sound field could, in theory, be simulated without any measurements. In this work, we address the intermediate case, where only partial or uncertain boundary information is available. This setting is similar to one studied in virtual reality applications, where the goal is to create a perceptually convincing audio experience. In this work, we focus on spatial sound control applications, which in contrast require an accurate sound field reconstruction. Therefore, we formulate the problem within a linear Bayesian framework, incorporating a boundary-informed prior derived from impedance boundary conditions. The formulation allows for joint optimization of the unknown hyperparameters, including the noise and signal variances and the impedance boundary conditions. Using numerical experiments, we show that incorporating the boundary-informed prior significantly enhances the reconstruction, notably even when only a few hundred boundary points are available or when the boundary positions are calibrated with an uncertainty of up to 1 dm.

* Accepted for publication at EUSIPCO 2025
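
The linear Bayesian framework mentioned in the abstract can be illustrated with a minimal sketch of a Gaussian posterior mean. Everything below is an assumption made for illustration: the squared-exponential covariance is a generic placeholder and not the boundary-informed prior of the paper, the hyperparameters are fixed rather than jointly optimized, and the names `gaussian_posterior_mean`, `prior_cov`, and `noise_var` are hypothetical.

```python
import numpy as np

def gaussian_posterior_mean(H, y, prior_cov, noise_var):
    """Posterior mean of x for y = H x + e, with x ~ N(0, prior_cov)
    and e ~ N(0, noise_var * I)."""
    # Covariance of the measurements implied by the prior and the noise.
    S = H @ prior_cov @ H.T + noise_var * np.eye(H.shape[0])
    # Solve a linear system instead of forming an explicit inverse.
    return prior_cov @ H.T @ np.linalg.solve(S, y)

rng = np.random.default_rng(0)
n_points, n_mics = 50, 8

# Placeholder prior (assumption): smooth squared-exponential kernel on a 1-D grid.
grid = np.linspace(0.0, 1.0, n_points)
prior_cov = np.exp(-0.5 * (grid[:, None] - grid[None, :]) ** 2 / 0.1 ** 2)

# Each hypothetical microphone observes the field at one grid point.
mic_idx = rng.choice(n_points, size=n_mics, replace=False)
H = np.eye(n_points)[mic_idx]

# Toy field and noisy measurements.
x_true = np.sin(2.0 * np.pi * grid)
noise_var = 1e-4
y = H @ x_true + np.sqrt(noise_var) * rng.standard_normal(n_mics)

# Reconstruction on the full grid from only n_mics measurements.
x_hat = gaussian_posterior_mean(H, y, prior_cov, noise_var)
```

In this setting, a more informative prior covariance is what reduces the number of measurements needed, which is the role the abstract assigns to the boundary information.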

Joint Spectrogram Separation and TDOA Estimation using Optimal Transport

Mar 24, 2025

Linda Fabiani, Sebastian J. Schlecht, Isabel Haasler, Filip Elvander

Abstract: Separating sources is a common challenge in applications such as speech enhancement and telecommunications, where distinguishing between overlapping sounds helps reduce interference and improve signal quality. Additionally, in multichannel systems, correct calibration and synchronization are essential to separate and locate source signals accurately. This work introduces a method for blind source separation and estimation of the Time Difference of Arrival (TDOA) of signals in the time-frequency domain. Our proposed method effectively separates signal mixtures into their original source spectrograms while simultaneously estimating the relative delays between receivers, using Optimal Transport (OT) theory. By exploiting the structure of the OT problem, we combine the separation and delay estimation processes into a unified framework, optimizing the system through a block coordinate descent algorithm. We analyze the performance of the OT-based estimator under various noise conditions and compare it with conventional TDOA and source separation methods. Numerical simulation results demonstrate that our proposed approach can achieve a significant level of accuracy across diverse noise scenarios for physical speech signals in both TDOA and source separation tasks.
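
For context on the conventional TDOA methods the abstract compares against, the following is a minimal sketch of a plain cross-correlation delay estimator for a single source and two receivers. It is not the OT-based method of the paper; the function name `tdoa_cross_correlation` and the synthetic signals are assumptions made for illustration.

```python
import numpy as np

def tdoa_cross_correlation(x_ref, x_delayed):
    """Estimate the delay (in samples) of x_delayed relative to x_ref
    as the lag that maximizes their cross-correlation."""
    corr = np.correlate(x_delayed, x_ref, mode="full")
    # Lags covered by the "full" correlation output.
    lags = np.arange(-(len(x_ref) - 1), len(x_delayed))
    return int(lags[np.argmax(corr)])

rng = np.random.default_rng(0)
source = rng.standard_normal(512)   # synthetic broadband source
true_delay = 7                      # delay in samples at the second receiver

# Second receiver: delayed copy of the source plus weak sensor noise.
received = np.concatenate([np.zeros(true_delay), source[:-true_delay]])
received += 0.05 * rng.standard_normal(received.size)

estimated_delay = tdoa_cross_correlation(source, received)
```

With several overlapping sources, the correlation shows multiple competing peaks, which is the kind of ambiguity a joint separation and delay estimation approach aims to resolve.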

Room Impulse Response Estimation through Optimal Mass Transport Barycenters

Mar 18, 2025

Rumeshika Pallewela, Yuyang Liu, Filip Elvander

Abstract: In this work, we consider the problem of jointly estimating a set of room impulse responses (RIRs) corresponding to closely spaced microphones. The accurate estimation of RIRs is crucial in acoustic applications such as speech enhancement, noise cancellation, and auralization. However, real-world constraints such as short excitation signals, low signal-to-noise ratios, and poor spectral excitation often render the estimation problem ill-posed. In this paper, we address these challenges by means of optimal mass transport (OMT) regularization. In particular, we propose to use an OMT barycenter, or generalized mean, as a mechanism for information sharing between the microphones. This allows us to quantify and exploit similarities in the delay structures between the different microphones without having to impose rigid assumptions on the room acoustics. The resulting estimator is formulated in terms of the solution to a convex optimization problem which can be implemented using standard solvers. In numerical examples, we demonstrate the potential of the proposed method in addressing otherwise ill-conditioned estimation scenarios.

* Submitted to EUSIPCO 2025

A Diffusion-Based Generative Equalizer for Music Restoration

Mar 27, 2024

Eloi Moliner, Maija Turunen, Filip Elvander, Vesa Välimäki

Abstract: This paper presents a novel approach to audio restoration, focusing on the enhancement of low-quality music recordings, and in particular historical ones. Building upon a previous algorithm called BABE, or Blind Audio Bandwidth Extension, we introduce BABE-2, which presents a series of significant improvements. This research broadens the concept of bandwidth extension to generative equalization, a novel task that, to the best of our knowledge, has not been explicitly addressed in previous studies. BABE-2 is built around an optimization algorithm utilizing priors from diffusion models, which are trained or fine-tuned using a curated set of high-quality music tracks. The algorithm simultaneously performs two critical tasks: estimation of the filter degradation magnitude response and hallucination of the restored audio. The proposed method is objectively evaluated on historical piano recordings, showing a marked enhancement over the prior version. The method yields similarly impressive results in rejuvenating the works of renowned vocalists Enrico Caruso and Nellie Melba. This research represents an advancement in the practical restoration of historical music.

* Submitted to DAFx24. Historical music restoration examples are available at: http://research.spa.aalto.fi/publications/papers/dafx-babe2/

Multi-Source Localization and Data Association for Time-Difference of Arrival Measurements

Mar 15, 2024

Gabrielle Flood, Filip Elvander

Abstract: In this work, we consider the problem of localizing multiple signal sources based on time-difference of arrival (TDOA) measurements. In the blind setting, in which the source signals are not known, the localization task is challenging due to the data association problem. That is, it is not known which of the TDOA measurements correspond to the same source. Herein, we propose to perform joint localization and data association by means of an optimal transport formulation. The method operates by finding optimal groupings of TDOA measurements and associating these with candidate source locations. To allow for computationally feasible localization in three-dimensional space, an efficient set of candidate locations is constructed using a minimal multilateration solver based on minimal sets of receiver pairs. In numerical simulations, we demonstrate that the proposed method is robust both to measurement noise and TDOA detection errors. Furthermore, it is shown that the data association provided by the proposed method allows for statistically efficient estimates of the source locations.

Room Impulse Response Estimation using Optimal Transport: Simulation-Informed Inference

Mar 06, 2024

David Sundström, Anton Björkman, Andreas Jakobsson, Filip Elvander

Abstract: The ability to accurately estimate room impulse responses (RIRs) is integral to many applications of spatial audio processing. Regrettably, estimating the RIR using ambient signals, such as speech or music, remains a challenging problem due to, e.g., low signal-to-noise ratios, finite sample lengths, and poor spectral excitation. Commonly, in order to improve the conditioning of the estimation problem, priors are placed on the amplitudes of the RIR. Although serving as a regularizer, this type of prior is generally not useful when only approximate knowledge of the delay structure is available, which, for example, is the case when the prior is a simulated RIR from an approximation of the room geometry. In this work, we target the delay structure itself, constructing a prior based on the concept of optimal transport. As illustrated using both simulated and measured data, the resulting method is able to beneficially incorporate information even from simple simulation models, displaying considerable robustness to perturbations in the assumed room dimensions and its temperature.
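
To illustrate why a transport-based comparison suits approximately known delays, the sketch below contrasts the Euclidean distance with the exact one-dimensional Wasserstein-1 cost for two single-tap delay profiles. This is a toy example under stated assumptions, not the prior constructed in the paper: the profiles are nonnegative and normalized, the grid is uniform, and the helper names `wasserstein1_1d` and `spike` are hypothetical.

```python
import numpy as np

def wasserstein1_1d(p, q, dx=1.0):
    """Exact 1-D Wasserstein-1 cost between two nonnegative profiles
    on a shared uniform grid, computed from their cumulative sums."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return dx * np.sum(np.abs(np.cumsum(p) - np.cumsum(q)))

def spike(n, k):
    """Length-n profile with all mass at delay index k."""
    h = np.zeros(n)
    h[k] = 1.0
    return h

n = 100
reference = spike(n, 40)  # e.g. a delay predicted by a rough simulation

results = {}
for shift in (1, 5, 20):
    candidate = spike(n, 40 + shift)  # delay perturbed by `shift` samples
    l2 = np.linalg.norm(reference - candidate)
    ot = wasserstein1_1d(reference, candidate)
    results[shift] = (l2, ot)
```

The Euclidean distance is the same for every nonzero shift, so it cannot tell a nearly correct delay from a badly wrong one, whereas the transport cost grows in proportion to the shift.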

Multi-frequency tracking via group-sparse optimal transport

Feb 29, 2024

Isabel Haasler, Filip Elvander

Abstract: In this work, we introduce an optimal transport framework for inferring power distributions over both spatial location and temporal frequency. Recently, it has been shown that optimal transport is a powerful tool for estimating spatial spectra that change smoothly over time. In this work, we consider the tracking of the spatio-temporal spectrum corresponding to a small number of moving broad-band signal sources. Typically, such tracking problems are addressed by treating the spatio-temporal power distribution in a frequency-by-frequency manner, allowing the use of well-understood models for narrow-band signals. This, however, leads to decreased target resolution due to inefficient use of the available information. We propose an extension of the optimal transport framework that exploits information from several frequencies simultaneously by estimating a spatio-temporal distribution penalized by a group-sparsity regularizer. This approach finds a spatial spectrum that changes smoothly over time, and at each time instance has a small support that is similar across frequencies. To the best of the authors' knowledge, this is the first formulation combining optimal transport and sparsity for solving inverse problems. As is shown on simulated and real data, our method can successfully track targets in scenarios where information from separate frequency bands alone is insufficient.

* 6 pages, 9 figures

Zero-Shot Blind Audio Bandwidth Extension

Jun 02, 2023

Eloi Moliner, Filip Elvander, Vesa Välimäki

Abstract: Audio bandwidth extension involves the realistic reconstruction of high-frequency spectra from bandlimited observations. In cases where the lowpass degradation is unknown, such as in restoring historical audio recordings, this becomes a blind problem. This paper introduces a novel method called BABE (Blind Audio Bandwidth Extension) that addresses the blind problem in a zero-shot setting, leveraging the generative priors of a pre-trained unconditional diffusion model. During the inference process, BABE utilizes a generalized version of diffusion posterior sampling, where the degradation operator is unknown but parametrized and inferred iteratively. The performance of the proposed method is evaluated using objective and subjective metrics, and the results show that BABE surpasses state-of-the-art blind bandwidth extension baselines and achieves competitive performance compared to non-blind filter-informed methods when tested with synthetic data. Moreover, BABE exhibits robust generalization capabilities when enhancing real historical recordings, effectively reconstructing the missing high-frequency content while maintaining coherence with the original recording. Subjective preference tests confirm that BABE significantly improves the audio quality of historical music recordings. Examples of historical recordings restored with the proposed method are available on the companion webpage: http://research.spa.aalto.fi/publications/papers/ieee-taslp-babe/

* Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing

Distributed Adaptive Norm Estimation for Blind System Identification in Wireless Sensor Networks

Mar 01, 2023

Matthias Blochberger, Filip Elvander, Randall Ali, Jan Østergaard, Jesper Jensen, Marc Moonen, Toon van Waterschoot

Abstract: Distributed signal-processing algorithms in (wireless) sensor networks often aim to decentralize processing tasks to reduce communication cost and computational complexity or avoid reliance on a single device (i.e., fusion center) for processing. In this contribution, we extend a distributed adaptive algorithm for blind system identification that relies on the estimation of a stacked network-wide consensus vector at each node, the computation of which requires either broadcasting or relaying of node-specific values (i.e., local vector norms) to all other nodes. The extended algorithm employs a distributed-averaging-based scheme to estimate the network-wide consensus norm value by only using the local vector norm provided by neighboring sensor nodes. We introduce an adaptive mixing factor between instantaneous and recursive estimates of these norms for adaptivity in a time-varying system. Simulation results show that the extension provides estimation results close to the optimal fully-connected-network or broadcasting case while reducing inter-node transmission significantly.

* Accepted to ICASSP 2023
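
The idea of estimating a network-wide norm from neighbor-only exchanges can be sketched with a generic static consensus-averaging iteration. This is a textbook-style illustration, not the adaptive algorithm of the paper: the ring topology, the Metropolis weight rule, the fixed local values, and the assumption that every node knows the network size are all choices made here for the example, and the function names are hypothetical.

```python
import numpy as np

def metropolis_weights(adjacency):
    """Symmetric, doubly stochastic mixing matrix built from an
    undirected adjacency matrix using the Metropolis rule."""
    A = np.asarray(adjacency, dtype=float)
    deg = A.sum(axis=1)
    n = len(A)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and A[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        # Self-weight makes each row sum to one.
        W[i, i] = 1.0 - W[i].sum()
    return W

def distributed_average(values, W, n_iter):
    """Each node repeatedly replaces its value with a weighted mean of
    its own value and those of its neighbors."""
    x = np.asarray(values, dtype=float).copy()
    for _ in range(n_iter):
        x = W @ x
    return x

# Ring network of six nodes (assumed topology).
n = 6
A = np.zeros((n, n), dtype=int)
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

# Hypothetical squared norms of the local vectors at each node.
local_sq_norms = np.array([0.5, 1.2, 0.8, 2.0, 1.5, 0.3])

W = metropolis_weights(A)
avg_estimates = distributed_average(local_sq_norms, W, n_iter=200)

# If each node knows n, it can recover the norm of the stacked vector.
network_norm_estimates = np.sqrt(n * avg_estimates)
```

After enough iterations every node holds approximately the same average, so no broadcasting or relaying to all other nodes is needed.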

An efficient solver for designing optimal sampling schemes

Nov 10, 2021

Filip Elvander, Johan Swärd, Andreas Jakobsson

Abstract: In this short paper, we describe an efficient numerical solver for the optimal sampling problem considered in "Designing Sampling Schemes for Multi-Dimensional Data". An implementation may be found at https://www.maths.lu.se/staff/andreas-jakobsson/publications/.
