Title: Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study

Oct 07, 2021

Dawei Liang, Yangyang Shi, Yun Wang, Nayan Singhal, Alex Xiao, Jonathan Shaw, Edison Thomaz, Ozlem Kalinli, Mike Seltzer

Abstract: Detection of common events and scenes from audio is useful for extracting and understanding human contexts in daily life. Prior studies have shown that leveraging knowledge from a relevant domain is beneficial for a target acoustic event detection (AED) process. Inspired by the observation that many human-centered acoustic events in daily life involve voice elements, this paper investigates the potential of transferring high-level voice representations extracted from a public speaker dataset to enrich an AED pipeline. To this end, we develop a dual-branch neural network architecture for the joint learning of voice and acoustic features during an AED process, and we conduct thorough empirical studies to examine its performance on the public AudioSet [1] with different types of inputs. Our main observations are: 1) joint learning of audio and voice inputs improves AED performance, measured as mean average precision (mAP), for both a CNN baseline (0.292 vs 0.134 mAP) and a TALNet [2] baseline (0.361 vs 0.351 mAP); 2) augmenting the extra voice features is critical to maximizing model performance with dual inputs.

* Submitted to ICASSP 2022
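
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how mean average precision (mAP) is commonly computed for multi-label audio tagging: average precision is computed per class from ranked clip scores, then averaged over classes. This is not the authors' evaluation code. The function names, the toy scores, and the choice to assign 0.0 to a class with no positive clips are all assumptions made for illustration.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision for one class (hypothetical helper).

    Clips are ranked by score; precision is recorded at the rank of each
    positive clip, and those precisions are averaged.
    """
    # Rank clip indices from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precisions.append(hits / rank)
    # Assumption: a class with no positive clips contributes 0.0.
    return sum(precisions) / hits if hits else 0.0


def mean_average_precision(
    score_matrix: Sequence[Sequence[float]],
    label_matrix: Sequence[Sequence[int]],
) -> float:
    """Mean of per-class AP; rows are clips, columns are event classes."""
    n_classes = len(score_matrix[0])
    per_class = [
        average_precision(
            [row[c] for row in score_matrix],
            [row[c] for row in label_matrix],
        )
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


if __name__ == "__main__":
    # Toy data: 3 clips, 2 classes (made-up numbers, not from the paper).
    scores = [[0.9, 0.2], [0.1, 0.8], [0.6, 0.7]]
    labels = [[1, 1], [0, 0], [1, 1]]
    print(round(mean_average_precision(scores, labels), 4))
```

In the toy data, the first class is ranked perfectly (AP of 1.0), while the second class has a negative clip ranked first, which lowers its AP and therefore the mean.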
