Title: Few-Shot Audio-Visual Learning of Environment Acoustics

Jun 08, 2022

Sagnik Majumder, Changan Chen, Ziad Al-Halah, Kristen Grauman


Abstract: Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener, with implications for various applications in AR, VR, and robotics. Whereas traditional methods to estimate RIRs assume dense geometry and/or sound measurements throughout the environment, we explore how to infer RIRs based on a sparse set of images and echoes observed in the space. Towards that goal, we introduce a transformer-based method that uses self-attention to build a rich acoustic context, then predicts RIRs of arbitrary query source-receiver locations through cross-attention. Additionally, we design a novel training objective that improves the match in the acoustic signature between the RIR predictions and the targets. In experiments using a state-of-the-art audio-visual simulator for 3D environments, we demonstrate that our method successfully generates arbitrary RIRs, outperforming state-of-the-art methods and, in a major departure from traditional methods, generalizing to novel environments in a few-shot manner. Project: http://vision.cs.utexas.edu/projects/fs_rir.
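To make concrete what it means for an RIR to "transform the sounds heard by a listener", the following is a minimal sketch of the standard signal-processing view: the sound arriving at the receiver is the dry source signal convolved with the RIR for that source-receiver pair. This is not the paper's method (which predicts the RIR itself); the function name `apply_rir` and the toy RIR values are assumptions chosen purely for illustration.

```python
def apply_rir(dry, rir):
    """Full linear convolution of a dry source signal with a room impulse response."""
    out = [0.0] * (len(dry) + len(rir) - 1)
    for i, sample in enumerate(dry):
        for j, tap in enumerate(rir):
            # Each source sample arrives once per RIR tap, delayed by j and scaled by tap.
            out[i + j] += sample * tap
    return out


# Hypothetical toy RIR: direct path at sample 0, weaker echoes at samples 3 and 5.
toy_rir = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]

# A single click followed by silence: the listener hears the RIR itself.
heard_click = apply_rir([1.0, 0.0], toy_rir)

# Two clicks, two samples apart: the echoes of each click overlap and add up.
heard_two_clicks = apply_rir([1.0, 0.0, 1.0], toy_rir)
```

Given a predicted RIR for an arbitrary query source-receiver location, the same convolution could in principle be used to render how any dry sound would be heard at that location.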

View paper on OpenReview
