Jacob Benesty

Spatial-Filter-Bank-Based Neural Method for Multichannel Speech Enhancement

Apr 02, 2025

Tianqin Zheng, Jilu Jin, Hanchen Pei, Gongping Huang, Jingdong Chen, Jacob Benesty

Abstract: The performance of deep learning-based multi-channel speech enhancement methods often deteriorates when the geometric parameters of the microphone array change. Traditional approaches to mitigate this issue typically involve training on multiple microphone arrays, which can be costly. To address this challenge, we focus on uniform circular arrays and propose the use of a spatial filter bank to extract features that are approximately invariant to geometric parameters. These features are then processed by a two-stage conformer-based model (TSCBM) to enhance speech quality. Experimental results demonstrate that our proposed method can be trained on a fixed microphone array while maintaining effective performance at deployment across uniform circular arrays with unseen geometric configurations.
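
As a rough illustration of what a spatial filter bank on a uniform circular array can look like, the sketch below builds a bank of plain delay-and-sum beams steered to equally spaced azimuths and returns one magnitude feature per beam. This is not the filter bank design from the paper, which the abstract does not specify. The function names, the far-field model, the phase sign convention, and the speed of sound are all assumptions made for the example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air


def uca_steering_vector(num_mics, radius, freq_hz, azimuth):
    """Far-field steering vector of a uniform circular array (horizontal plane)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    return [
        cmath.exp(1j * k * radius * math.cos(azimuth - 2.0 * math.pi * m / num_mics))
        for m in range(num_mics)
    ]


def beam_response(num_mics, radius, freq_hz, look_azimuth, source_azimuth):
    """Magnitude response of a delay-and-sum beam steered to look_azimuth."""
    w = uca_steering_vector(num_mics, radius, freq_hz, look_azimuth)
    d = uca_steering_vector(num_mics, radius, freq_hz, source_azimuth)
    return abs(sum(wm.conjugate() * dm for wm, dm in zip(w, d))) / num_mics


def filter_bank_features(num_beams, num_mics, radius, freq_hz, source_azimuth):
    """One feature per beam; the beams cover the circle at equal azimuth steps."""
    looks = [2.0 * math.pi * b / num_beams for b in range(num_beams)]
    return [
        beam_response(num_mics, radius, freq_hz, look, source_azimuth)
        for look in looks
    ]


# Hypothetical geometry: 6 microphones, 5 cm radius, 2 kHz, source on beam 2.
features = filter_bank_features(8, 6, 0.05, 2000.0, 2.0 * math.pi * 2 / 8)
print([round(f, 3) for f in features])
```

The beam pointing at the source has unit response by construction. How closely such features stay invariant when the radius or microphone count changes is the question the paper studies, and this sketch makes no claim about it.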

A Unified Bayesian Perspective for Conventional and Robust Adaptive Filters

Feb 25, 2025

Leszek Szczecinski, Jacob Benesty, Eduardo Vinicius Kuhn

Abstract: In this work, we present a new perspective on the origin and interpretation of adaptive filters. By applying Bayesian principles of recursive inference from the state-space model and using a series of simplifications regarding the structure of the solution, we can present, in a unified framework, derivations of many adaptive filters which depend on the probabilistic model of the observational noise. In particular, under a Gaussian model, we obtain solutions well known in the literature (such as LMS, NLMS, or the Kalman filter), while with non-Gaussian noise, we obtain new families of adaptive filters. Notably, under the assumption of Laplacian noise, we obtain a family of robust filters of which the signed-error algorithm is a well-known member, while other algorithms, derived effortlessly in the proposed framework, are entirely new. Numerical examples are shown to illustrate the properties and provide better insight into the performance of the derived adaptive filters.
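
For readers unfamiliar with the algorithms named in the abstract, the following minimal sketch shows the textbook LMS, NLMS, and sign-error updates side by side on a noise-free system identification task. It does not reproduce the paper's Bayesian derivation or any of its new filters. The helper names `adapt` and `make_data`, the step sizes, and the three-tap system are assumptions chosen for illustration.

```python
import random


def adapt(x, d, num_taps, rule, mu, eps=1e-8):
    """Run one adaptive filter over input x and desired signal d."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input samples, newest first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        e = dn - sum(wi * bi for wi, bi in zip(w, buf))  # a priori error
        if rule == "lms":
            g = mu * e
        elif rule == "nlms":
            g = mu * e / (eps + sum(b * b for b in buf))
        elif rule == "sign":
            g = mu * (1.0 if e > 0 else -1.0 if e < 0 else 0.0)
        else:
            raise ValueError("unknown rule: " + rule)
        w = [wi + g * bi for wi, bi in zip(w, buf)]
    return w


def make_data(h, num_samples, seed=0):
    """White Gaussian input filtered by the unknown system h (no noise)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    d = [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(num_samples)
    ]
    return x, d


h_true = [0.5, -0.3, 0.1]  # hypothetical system to identify
x, d = make_data(h_true, 5000)
for rule, mu in (("lms", 0.05), ("nlms", 0.5), ("sign", 0.002)):
    w = adapt(x, d, len(h_true), rule, mu)
    print(rule, [round(wi, 3) for wi in w])
```

The three rules differ only in how the error scales the update: LMS uses the error itself, NLMS divides it by the input energy, and the sign-error rule keeps only its sign, which is what makes it insensitive to large outliers.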

Advances in Microphone Array Processing and Multichannel Speech Enhancement

Feb 13, 2025

Gongping Huang, Jesper R. Jensen, Jingdong Chen, Jacob Benesty, Mads G. Christensen, Akihiko Sugiyama, Gary Elko, Tomas Gaensler

Abstract: This paper reviews pioneering works in microphone array processing and multichannel speech enhancement, highlighting historical achievements, technological evolution, commercialization aspects, and key challenges. It provides valuable insights into the progression and future direction of these areas. The paper examines foundational developments in microphone array design and optimization, showcasing innovations that improved sound acquisition and enhanced speech intelligibility in noisy and reverberant environments. It then introduces recent advancements and cutting-edge research in the field, particularly the integration of deep learning techniques such as all-neural beamformers. The paper also explores critical applications, discussing their evolution and current state-of-the-art technologies that significantly impact user experience. Finally, the paper outlines future research directions, identifying challenges and potential solutions that could drive further innovation in these fields. By providing a comprehensive overview and forward-looking perspective, this paper aims to inspire ongoing research and contribute to the sustained growth and development of microphone arrays and multichannel speech enhancement.

* accepted by ICASSP 2025

Independent low-rank matrix analysis based on the Sinkhorn divergence source model for blind source separation

Jan 03, 2024

Jianyu Wang, Shanzheng Guan, Jingdong Chen, Jacob Benesty

Abstract: The so-called independent low-rank matrix analysis (ILRMA) has demonstrated great potential for dealing with the problem of determined blind source separation (BSS) for audio and speech signals. This method assumes that the spectra from different frequency bands are independent and the spectral coefficients in any frequency band are Gaussian distributed. The Itakura-Saito divergence is then employed to estimate the source model parameters. In reality, however, the spectral coefficients from different frequency bands may be dependent, which is not considered in the existing ILRMA algorithm. This paper presents an improved version of ILRMA, which considers the dependency between the spectral coefficients from different frequency bands. The Sinkhorn divergence is then exploited to optimize the source model parameters. As a result of using the cross-band information, the BSS performance is improved. However, the number of parameters to be estimated also increases significantly, as does the computational complexity. To reduce the algorithm complexity, we apply the Kronecker product to decompose the modeling matrix into the product of a number of matrices of much smaller dimensionality. An efficient algorithm is then developed to implement the Sinkhorn divergence based BSS algorithm and the complexity is reduced by an order of magnitude.
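
The parameter saving from a Kronecker factorization can be seen with a small sketch. It only illustrates the general idea of replacing one large square matrix with the Kronecker product of smaller square factors; the factor sizes, the number of factors, and the function names are assumptions, and the paper's actual modeling matrix and algorithm are not reproduced here.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [
            a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
            for j in range(len(a[0]) * cols_b)
        ]
        for i in range(len(a) * rows_b)
    ]


def parameter_counts(factor_sizes):
    """Entries in the full matrix versus the sum over its square factors."""
    full_dim = math.prod(factor_sizes)
    return {
        "full": full_dim * full_dim,
        "factored": sum(s * s for s in factor_sizes),
    }


small = kron([[1, 2], [3, 4]], [[0, 1], [1, 0]])
for row in small:
    print(row)

# Hypothetical case: 256 frequency bands modeled as 16 x 16.
print(parameter_counts([16, 16]))
```

With two 16 x 16 factors, a 256 x 256 cross-band matrix is described by 512 entries instead of 65536, which is the kind of reduction the abstract refers to.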

Automatic Regularization for Linear MMSE Filters

Dec 11, 2023

Daniel Gomes de Pinho Zanco, Leszek Szczecinski, Jacob Benesty

Abstract: In this work, we consider the problem of regularization in minimum mean-squared error (MMSE) linear filters. Exploiting the relationship with statistical machine learning methods, the regularization parameter is found from the observed signals in a simple and automatic manner. The proposed approach is illustrated through system identification examples, where the automatic regularization yields near-optimal results.
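
To make the role of the regularization parameter concrete, the sketch below solves the regularized normal equations (R + delta I) w = p for a fixed, hand-picked delta. The automatic selection of delta from the observed signals is the paper's contribution and is not implemented here. The names `solve` and `regularized_mmse_filter`, and the toy values of R and p, are assumptions for the example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def regularized_mmse_filter(R, p, delta):
    """Filter w solving (R + delta * I) w = p, with delta >= 0 chosen by hand."""
    n = len(p)
    R_reg = [
        [R[i][j] + (delta if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return solve(R_reg, p)


R = [[2.0, 1.0], [1.0, 2.0]]  # toy input correlation matrix
p = [3.0, 3.0]                # toy cross-correlation vector
print(regularized_mmse_filter(R, p, 0.0))
print(regularized_mmse_filter(R, p, 1.0))
```

Increasing delta shrinks the filter coefficients toward zero, trading bias for lower sensitivity to errors in the estimated statistics.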

An Anchor-Point Based Image-Model for Room Impulse Response Simulation with Directional Source Radiation and Sensor Directivity Patterns

Aug 21, 2023

Chao Pan, Lei Zhang, Yilong Lu, Jilu Jin, Lin Qiu, Jingdong Chen, Jacob Benesty

Abstract: The image model method has been widely used to simulate room impulse responses, and the endeavor to adapt this method to different applications has also piqued great interest over the last few decades. This paper attempts to extend the image model method and develops an anchor-point-image-model (APIM) approach as a solution for simulating impulse responses by including both the source radiation and sensor directivity patterns. To determine the orientations of all the virtual sources, anchor points are introduced for the real sources, and these in turn determine the orientations of the virtual sources. An algorithm is developed to generate room impulse responses with APIM by taking into account the directional pattern functions, fractional time delays, as well as the computational complexity. The developed model and algorithms can be used in various acoustic problems to simulate room acoustics and improve and evaluate processing algorithms.

* 19 pages, 8 figures
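
One plausible reading of the anchor-point idea, sketched here for first-order reflections in a rectangular room, is to mirror an auxiliary point together with the source and take the orientation of each virtual source as the vector from the mirrored source to the mirrored anchor. This is an illustrative assumption, not the algorithm from the paper; the function names, the room with one corner at the origin, and the speed of sound are all hypothetical choices.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air


def mirror(point, axis, wall_coord):
    """Reflect a 3-D point across the wall plane at wall_coord along axis."""
    q = list(point)
    q[axis] = 2.0 * wall_coord - q[axis]
    return tuple(q)


def first_order_images(source, anchor, room_dims):
    """Virtual sources and their orientations for the six walls of a box room.

    The room spans [0, room_dims[i]] along each axis. The anchor is a point
    placed in front of the source, so anchor - source is the facing direction.
    """
    images = []
    for axis in range(3):
        for wall_coord in (0.0, room_dims[axis]):
            img_src = mirror(source, axis, wall_coord)
            img_anchor = mirror(anchor, axis, wall_coord)
            orientation = tuple(a - s for a, s in zip(img_anchor, img_src))
            images.append((img_src, orientation))
    return images


def delay_seconds(image_source, receiver):
    """Propagation delay from a (virtual) source to the receiver."""
    return math.dist(image_source, receiver) / SPEED_OF_SOUND


# Hypothetical setup: source facing +x in a 4 m x 5 m x 3 m room.
for img, facing in first_order_images((1, 2, 1.5), (2, 2, 1.5), (4, 5, 3)):
    print(img, facing, round(delay_seconds(img, (3, 3, 1.5)) * 1000, 2), "ms")
```

A wall perpendicular to the facing direction flips the orientation of the virtual source, while a wall parallel to it leaves the orientation unchanged, which is why the orientation has to be tracked per image before a directivity pattern can be applied.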

Bilinear Models for Machine Learning

Dec 06, 2019

Tayssir Doghri, Leszek Szczecinski, Jacob Benesty, Amar Mitiche

Abstract: In this work, we define and analyze bilinear models which replace the conventional linear operation used in many building blocks of machine learning (ML). The main idea is to devise ML algorithms which are adapted to the objects they treat. In the case of monochromatic images, we show that the bilinear operation better exploits the structure of the image than the conventional linear operation, which ignores the spatial relationship between the pixels. This translates into a significantly smaller number of parameters required to yield the same performance. We show numerical examples of classification on the MNIST data set.
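
The parameter saving can be illustrated with the simplest bilinear form, y = u^T X v, applied to an image X, compared with a linear form over the flattened image. The abstract does not give the exact bilinear operation used in the paper, so this rank-one form and all function names below are assumptions for illustration.

```python
def bilinear_output(image, u, v):
    """y = u^T X v: one weight per row (u) and one per column (v)."""
    return sum(
        u[i] * image[i][j] * v[j]
        for i in range(len(u))
        for j in range(len(v))
    )


def linear_output(image, w):
    """y = <W, X>: one independent weight per pixel."""
    return sum(
        w[i][j] * image[i][j]
        for i in range(len(image))
        for j in range(len(image[0]))
    )


def outer(u, v):
    """Weight matrix of the linear model equivalent to the bilinear one."""
    return [[ui * vj for vj in v] for ui in u]


def parameter_counts(height, width):
    return {"linear": height * width, "bilinear": height + width}


image = [[1.0, 2.0], [3.0, 4.0]]  # toy 2 x 2 "image"
u, v = [1.0, -1.0], [2.0, 0.5]
print(bilinear_output(image, u, v), linear_output(image, outer(u, v)))
print(parameter_counts(28, 28))  # MNIST images are 28 x 28 pixels
```

The bilinear output equals a linear output whose weight matrix is constrained to the outer product of u and v, so it uses 56 parameters per output on a 28 x 28 image where an unconstrained linear map uses 784.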
