Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jason Cramer

SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

Sep 11, 2020

Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon(+2 more)

Figure 1 for SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

Figure 2 for SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

Figure 3 for SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

Figure 4 for SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

Abstract:We present SONYC-UST-V2, a dataset for urban sound tagging with spatiotemporal information. This dataset is aimed for the development and evaluation of machine listening systems for real-world urban noise monitoring. While datasets of urban recordings are available, this dataset provides the opportunity to investigate how spatiotemporal metadata can aid in the prediction of urban sound tags. SONYC-UST-V2 consists of 18510 audio recordings from the "Sounds of New York City" (SONYC) acoustic sensor network, including the timestamp of audio acquisition and location of the sensor. The dataset contains annotations by volunteers from the Zooniverse citizen science platform, as well as a two-stage verification with our team. In this article, we describe our data collection procedure and propose evaluation metrics for multilabel classification of urban sound tags. We report the results of a simple baseline model that exploits spatiotemporal information.

Via

Access Paper or Ask Questions

Long-distance Detection of Bioacoustic Events with Per-channel Energy Normalization

Nov 01, 2019

Vincent Lostanlen, Kaitlin Palmer, Elly Knight, Christopher Clark, Holger Klinck, Andrew Farnsworth, Tina Wong, Jason Cramer, Juan Pablo Bello

Figure 1 for Long-distance Detection of Bioacoustic Events with Per-channel Energy Normalization

Figure 2 for Long-distance Detection of Bioacoustic Events with Per-channel Energy Normalization

Figure 3 for Long-distance Detection of Bioacoustic Events with Per-channel Energy Normalization

Abstract:This paper proposes to perform unsupervised detection of bioacoustic events by pooling the magnitudes of spectrogram frames after per-channel energy normalization (PCEN). Although PCEN was originally developed for speech recognition, it also has beneficial effects in enhancing animal vocalizations, despite the presence of atmospheric absorption and intermittent noise. We prove that PCEN generalizes logarithm-based spectral flux, yet with a tunable time scale for background noise estimation. In comparison with pointwise logarithm, PCEN reduces false alarm rate by 50x in the near field and 5x in the far field, both on avian and marine bioacoustic datasets. Such improvements come at moderate computational cost and require no human intervention, thus heralding a promising future for PCEN in bioacoustics.

* 5 pages, 3 figures. Presented at the 3rd International Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE). 25--26 October 2019, New York, NY, USA

Via

Access Paper or Ask Questions