Gabriel Bibbó

The Sounds of Home: A Speech-Removed Residential Audio Dataset for Sound Event Detection

Sep 17, 2024

Gabriel Bibbó, Thomas Deacon, Arshdeep Singh, Mark D. Plumbley

Abstract: This paper presents a residential audio dataset to support sound event detection research for smart home applications aimed at promoting wellbeing for older adults. The dataset is constructed by deploying audio recording systems in the homes of 8 participants aged 55-80 years for a 7-day period. Acoustic characteristics are documented through detailed floor plans and construction material information to enable replication of the recording environments for AI model deployment. A novel automated speech removal pipeline is developed, using pre-trained audio neural networks to detect and remove segments containing spoken voice, while preserving segments containing other sound events. The resulting dataset consists of privacy-compliant audio recordings that accurately capture the soundscapes and activities of daily living within residential spaces. The paper details the dataset creation methodology, the speech removal pipeline utilizing cascaded model architectures, and an analysis of the vocal label distribution to validate the speech removal process. This dataset enables the development and benchmarking of sound event detection models tailored specifically for in-home applications.
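
The speech removal idea described in this abstract can be illustrated with a minimal sketch. It assumes that some pre-trained tagger has already been wrapped as a function returning a speech probability per segment. The two-stage cascade, the threshold values, and all names below are hypothetical and are not taken from the paper, which may organise its cascaded models differently.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    start_s: float  # segment start time in seconds
    end_s: float    # segment end time in seconds


# A scorer maps a segment to a speech probability in [0, 1].
# In practice it would wrap a pre-trained audio tagging network;
# here it is a hypothetical stand-in supplied by the caller.
Scorer = Callable[[Segment], float]


def remove_speech(
    segments: List[Segment],
    fast_scorer: Scorer,
    slow_scorer: Scorer,
    clear_below: float = 0.1,  # assumed threshold, not from the paper
    drop_above: float = 0.5,   # assumed threshold, not from the paper
) -> List[Segment]:
    """Keep only segments judged free of speech by a two-stage cascade."""
    kept = []
    for seg in segments:
        if fast_scorer(seg) < clear_below:
            # Confidently non-speech: keep without running the second model.
            kept.append(seg)
            continue
        # Uncertain or likely speech: defer to the second model.
        if slow_scorer(seg) < drop_above:
            kept.append(seg)
    return kept
```

Segments that fail both checks are discarded entirely, so other sound events overlapping with speech are lost along with it; that trade-off favours privacy over recall in this sketch.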

Integrating IP Broadcasting with Audio Tags: Workflow and Challenges

Jul 23, 2024

Rhys Burchett-Vass, Arshdeep Singh, Gabriel Bibbó, Mark D. Plumbley

Abstract: The broadcasting industry is increasingly adopting IP techniques, revolutionising both live and pre-recorded content production, from news gathering to live music events. IP broadcasting allows for the transport of audio and video signals in an easily configurable way, aligning with modern networking techniques. This shift towards an IP workflow allows for much greater flexibility, not only in routing signals but also in integrating tools built with standard web development techniques. One such tool is live audio tagging, which has a number of uses in the production of content. These range from automated closed captioning to identifying unwanted sound events within a scene. In this paper, we describe the process of containerising an audio tagging model into a microservice, a small segregated code module that can be integrated into a multitude of different network setups. The goal is to develop a modular, accessible, and flexible tool capable of seamless deployment into broadcasting workflows of all sizes, from small productions to large corporations. Challenges surrounding latency of the selected audio tagging model and its effect on the usefulness of the end product are discussed.

* Submitted to DCASE 2024 Workshop
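
As a rough illustration of what wrapping a tagging model as a microservice might look like, the sketch below exposes a single HTTP endpoint using only the Python standard library. The `/tag` path, the JSON request shape, and the `tag_audio` function are all hypothetical: `tag_audio` is a toy amplitude-based stand-in, not the model used in the paper, and the paper's actual framework and container setup are not described in this abstract.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def tag_audio(samples):
    """Hypothetical stand-in for an audio tagging model.

    Scores two made-up labels from the mean absolute amplitude only.
    """
    level = sum(abs(x) for x in samples) / max(len(samples), 1)
    level = min(level, 1.0)
    return {"silence": round(1.0 - level, 3), "sound": round(level, 3)}


class TagHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/tag":  # hypothetical endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            samples = [float(x) for x in payload["samples"]]
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected JSON body with a 'samples' list")
            return
        body = json.dumps({"tags": tag_audio(samples)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real service would log requests


def make_server(host="127.0.0.1", port=8000):
    """Build the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), TagHandler)
```

Running `make_server().serve_forever()` starts the service locally. Because each request is handled in turn, the time the model takes per clip adds directly to the response delay, which is one way the latency concern raised in the abstract would show up in practice.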
