Jihui Zhang

RIR-Former: Coordinate-Guided Transformer for Continuous Reconstruction of Room Impulse Responses

Feb 03, 2026

Shaoheng Xu, Chunyi Sun, Jihui Zhang, Prasanga N. Samarasinghe, Thushara D. Abhayapala

Abstract: Room impulse responses (RIRs) are essential for many acoustic signal processing tasks, yet measuring them densely across space is often impractical. In this work, we propose RIR-Former, a grid-free, one-step feed-forward model for RIR reconstruction. By introducing a sinusoidal encoding module into a transformer backbone, our method effectively incorporates microphone position information, enabling interpolation at arbitrary array locations. Furthermore, a segmented multi-branch decoder is designed to separately handle early reflections and late reverberation, improving reconstruction across the entire RIR. Experiments on diverse simulated acoustic environments demonstrate that RIR-Former consistently outperforms state-of-the-art baselines in terms of normalized mean square error (NMSE) and cosine distance (CD), under varying missing rates and array configurations. These results highlight the potential of our approach for practical deployment and motivate future work on scaling from randomly spaced linear arrays to complex array geometries, dynamic acoustic scenes, and real-world environments.
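As a rough illustration of two ideas named in the abstract, the sketch below shows a generic sinusoidal encoding of a scalar microphone coordinate and one common way to compute NMSE and cosine distance between an estimated and a reference RIR. This is not the authors' implementation: the function names, the frequency schedule `2^k * pi`, the number of frequencies, and the choice to report NMSE in decibels are all assumptions made for the example.

```python
import math


def sinusoidal_encoding(coord, num_freqs=4):
    """Encode a scalar coordinate as [sin(2^k*pi*x), cos(2^k*pi*x)] for k = 0..num_freqs-1.

    The frequency schedule is a common choice, assumed here for illustration.
    """
    features = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * coord
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def nmse_db(estimate, reference):
    """Normalized mean square error in dB: 10*log10(||est - ref||^2 / ||ref||^2)."""
    err = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    ref = sum(r ** 2 for r in reference)
    if err == 0.0:
        return float("-inf")  # perfect reconstruction
    return 10.0 * math.log10(err / ref)


def cosine_distance(estimate, reference):
    """Cosine distance: 1 - <est, ref> / (||est|| * ||ref||)."""
    dot = sum(e * r for e, r in zip(estimate, reference))
    norm_e = math.sqrt(sum(e * e for e in estimate))
    norm_r = math.sqrt(sum(r * r for r in reference))
    return 1.0 - dot / (norm_e * norm_r)


if __name__ == "__main__":
    # Hypothetical microphone position (normalized to [0, 1]) and toy RIRs.
    print(sinusoidal_encoding(0.25, num_freqs=2))
    reference = [1.0, 0.5, 0.25, 0.125]
    estimate = [0.9, 0.55, 0.2, 0.1]
    print(nmse_db(estimate, reference), cosine_distance(estimate, reference))
```

Lower values are better for both metrics: NMSE is sensitive to amplitude errors, while cosine distance depends only on the shape of the waveform.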

* Accepted to International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2026. Equal contribution: Shaoheng Xu and Chunyi Sun

GMM based multi-stage Wiener filtering for low SNR speech enhancement

Jun 19, 2022

Wageesha Manamperi, Prasanga N. Samarasinghe, Thushara D. Abhayapala, Jihui Zhang

Abstract: This paper proposes a single-channel speech enhancement method that reduces noise and enhances speech at low signal-to-noise ratio (SNR) levels and under non-stationary noise conditions. Specifically, we focus on modeling the noise using a Gaussian mixture model (GMM) based on a multi-stage process with a parametric Wiener filter. The proposed noise model estimates a more accurate noise power spectral density (PSD), and allows for better generalization under various noise conditions compared to traditional Wiener filtering methods. Simulations show that the proposed approach can achieve better performance in terms of speech quality (PESQ) and intelligibility (STOI) at low SNR levels.
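To make the term "parametric Wiener filter" concrete, here is a minimal sketch of one common textbook form of the per-frequency gain, G = (P_s / (P_s + alpha * P_n))^beta, applied to PSD values for a single frame. It does not reproduce the paper's GMM-based multi-stage noise estimation: the noise PSD is simply passed in, the speech PSD is estimated by plain power subtraction, and the function name, `alpha`, `beta`, and the gain floor are assumptions for illustration.

```python
def parametric_wiener_gain(noisy_psd, noise_psd, alpha=1.0, beta=1.0, floor=1e-3):
    """Per-bin gain G = (P_s / (P_s + alpha * P_n)) ** beta, limited below by `floor`.

    noisy_psd: PSD of the noisy speech, one value per frequency bin.
    noise_psd: estimated noise PSD for the same bins (assumed given).
    alpha, beta: tuning parameters; alpha = beta = 1 gives the standard Wiener gain.
    """
    gains = []
    for p_y, p_n in zip(noisy_psd, noise_psd):
        # Crude speech PSD estimate by power subtraction, clipped at zero.
        p_s = max(p_y - p_n, 0.0)
        denom = p_s + alpha * p_n
        g = (p_s / denom) ** beta if denom > 0.0 else 0.0
        # A small floor avoids fully zeroing bins, which tends to cause musical noise.
        gains.append(max(g, floor))
    return gains


if __name__ == "__main__":
    # Toy values: first bin is speech-dominated, second bin is noise-only.
    print(parametric_wiener_gain([4.0, 1.0], [1.0, 1.0]))
    # Larger alpha suppresses more aggressively.
    print(parametric_wiener_gain([4.0, 1.0], [1.0, 1.0], alpha=3.0))
```

In this form, raising `alpha` trades more noise suppression for more speech distortion, which is why the accuracy of the noise PSD estimate matters most at low SNR.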

* 5 pages, 3 figures, submitted to a conference
