Mikael Ovaska

University of Jyväskylä

Deep Neural Network Voice Activity Detector for Downsampled Audio Data: An Experiment Report

Aug 12, 2021

Mikael Ovaska, Joni Kultanen, Teemu Autto, Joonas Uusnäkki, Antti Kariluoto, Joonas Himmanen, Mikko Virtaneva, Pasi Kaitila, Pekka Abrahamsson

Abstract: Sociometric badges are an emerging technology for studying how teams interact in physical places. Audio data recorded by sociometric badges is often downsampled so that the badge holders' discussions are not recorded. To gain more information about interactions inside teams wearing sociometric badges, a Voice Activity Detector (VAD) is deployed to measure the verbal activity of the interaction. Detecting voice activity from downsampled audio data is challenging because downsampling destroys information in the data. We developed a VAD using deep learning techniques that achieves only moderate accuracy in a low-noise meeting setting and across variable noise levels, despite excellent validation performance. Experiences and lessons learned while developing the VAD are discussed.

* Pre-print. 7 pages, 2 figures, and 5 tables
