Carmen Peláez-Moreno

Spatio-temporal Latent Representations for the Analysis of Acoustic Scenes in-the-wild

Dec 10, 2024

Claudia Montero-Ramírez, Esther Rituerto-González, Carmen Peláez-Moreno

Abstract: In the field of acoustic scene analysis, this paper presents a novel approach to finding spatio-temporal latent representations from in-the-wild audio data. Using WE-LIVE, an in-house dataset that includes audio recordings from diverse real-world environments together with sparse GPS coordinates and self-annotated emotional and situational labels, we tackle the challenging pretext task of associating each audio segment with its corresponding location. The final aim, acoustically detecting violent (anomalous) contexts, is left as further work. By generating acoustic embeddings under the self-supervised learning paradigm, we aim to use the model-generated latent space to acoustically characterize the spatio-temporal context. We use YAMNet, an acoustic event classifier trained on AudioSet, to temporally locate and identify acoustic events in WE-LIVE. To transform the discrete acoustic events into embeddings, we compare the information-retrieval-based TF-IDF algorithm and Node2Vec, by analogy with Natural Language Processing techniques. A variational autoencoder (VAE) is then trained to provide a further adapted latent space. The analysis was carried out by measuring cosine distance and visualizing the data distribution via t-Distributed Stochastic Neighbor Embedding (t-SNE), which revealed distinct acoustic scenes. Specifically, we discern variations between indoor and subway environments. Notably, these distinctions emerge within the latent space of the VAE, in stark contrast to the random distribution of data points before encoding. In summary, our research contributes a pioneering approach for extracting spatio-temporal latent representations from in-the-wild audio data.

* 9 pages, 6 figures
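
As a rough illustration of the TF-IDF and cosine-distance steps mentioned in the abstract, the sketch below treats each audio segment as a "document" whose "words" are detected acoustic event labels. It is a minimal standard-library sketch, not the authors' implementation: the event labels, the example segments, the smoothed IDF variant, and the helper names `tfidf_vectors` and `cosine_distance` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(segments):
    """Return (vocabulary, TF-IDF vectors) for lists of event labels."""
    vocab = sorted({label for seg in segments for label in seg})
    n_docs = len(segments)
    # Document frequency: number of segments in which each label occurs.
    df = Counter(label for seg in segments for label in set(seg))
    # Smoothed IDF (an assumed variant) so that labels present in every
    # segment still keep a nonzero weight.
    idf = {lab: math.log((1 + n_docs) / (1 + df[lab])) + 1.0 for lab in vocab}
    vectors = []
    for seg in segments:
        counts = Counter(seg)
        total = len(seg)
        # Term frequency is the relative count of the label in the segment.
        vectors.append([counts[lab] / total * idf[lab] for lab in vocab])
    return vocab, vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical event sequences, as an event classifier might emit them.
segments = [
    ["Speech", "Speech", "Door", "Typing"],           # indoor-like
    ["Speech", "Typing", "Typing", "Door"],           # indoor-like
    ["Train", "Rail transport", "Speech", "Train"],   # subway-like
]

vocab, vectors = tfidf_vectors(segments)
d_indoor = cosine_distance(vectors[0], vectors[1])
d_cross = cosine_distance(vectors[0], vectors[2])
print(f"indoor vs indoor: {d_indoor:.3f}")
print(f"indoor vs subway: {d_cross:.3f}")
```

With these toy segments, the two indoor-like segments should lie closer to each other than either does to the subway-like one, which is the kind of separation the abstract reports observing in the VAE latent space.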

WEMAC: Women and Emotion Multi-modal Affective Computing dataset

Mar 01, 2022

Jose A. Miranda, Esther Rituerto-González, Laura Gutiérrez-Martín, Clara Luis-Mingueza, Manuel F. Canabal, Alberto Ramírez Bárcenas, Jose M. Lanza-Gutiérrez, Carmen Peláez-Moreno, Celia López-Ongil

Abstract: Among the seventeen Sustainable Development Goals (SDGs) proposed within the 2030 Agenda and adopted by all the United Nations member states, the Fifth SDG is a call for action to turn Gender Equality into a fundamental human right and an essential foundation for a better world. It includes the eradication of all types of violence against women. Within this context, the UC3M4Safety research team aims to develop Bindi, a cyber-physical system that includes embedded Artificial Intelligence algorithms for real-time user monitoring towards the detection of affective states, with the ultimate goal of achieving the early detection of risk situations for women. On this basis, we make use of wearable affective computing, including smart sensors, data encryption for secure and accurate collection of presumed crime evidence, and a remote connection to protecting agents. Towards the development of such a system, the recording of different laboratory and in-the-wild datasets is in progress. These are contained within the UC3M4Safety Database. This paper presents and details the first release of WEMAC, a novel multi-modal dataset comprising a laboratory-based experiment in which 47 women volunteers were exposed to validated audio-visual stimuli, presented through a virtual reality headset to induce real emotions, while physiological signals, speech signals and self-reports were collected. We believe this dataset will serve and assist research on multi-modal affective computing using physiological and speech information.

Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle

Oct 10, 2018

Francisco J. Valverde-Albacete, Carmen Peláez-Moreno

Abstract: Data transformation, e.g. feature transformation and selection, is an integral part of any machine learning procedure. In this paper we introduce an information-theoretic model and tools to assess the quality of data transformations in machine learning tasks. In an unsupervised fashion, we analyze the transfer of information in the transformation of a discrete, multivariate source of information X into a discrete, multivariate sink of information Y, related by a distribution $P_{XY}$. The first contribution is a decomposition of the maximal potential entropy of (X, Y), which we call a balance equation, into its a) non-transferable, b) transferable but not transferred and c) transferred parts. Such balance equations can be represented in (de Finetti) entropy diagrams, our second set of contributions. The most important of these, the aggregate Channel Multivariate Entropy Triangle, is a visual exploratory tool to assess the effectiveness of multivariate data transformations in transferring information from input to output variables. We also show how these decompositions and balance equations apply to the entropies of X and Y respectively, and generate entropy triangles for them. As an example, we present the application of these tools to the assessment of information transfer efficiency for PCA and ICA as unsupervised feature transformation and selection procedures in supervised classification tasks.

* Entropy 2018, 20(7), 498
* 21 pages, 7 figures and 1 table
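
To make the idea of an entropy balance equation concrete, the sketch below checks a three-term decomposition numerically for the simplest case, where X and Y are each a single discrete variable. It relies on the identity log|X| + log|Y| = ΔH + 2·I(X;Y) + [H(X|Y) + H(Y|X)], where ΔH is the gap between the uniform entropy and H(X) + H(Y). This is an assumption-laden illustration: the paper's multivariate definitions may differ, and the function name `balance_terms` and the example joint distribution are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def balance_terms(joint):
    """Split the maximal entropy of (X, Y) into three non-negative terms.

    joint is a 2-D list with joint[i][j] = P(X=i, Y=j), summing to 1.
    Returns (h_uniform, delta_h, mutual_info, variation_info).
    """
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    h_x, h_y = entropy(px), entropy(py)
    h_xy = entropy([p for row in joint for p in row])
    # Maximal potential entropy: both marginals uniform and independent.
    h_uniform = math.log2(len(px)) + math.log2(len(py))
    delta_h = h_uniform - (h_x + h_y)         # loss from non-uniform marginals
    mutual_info = h_x + h_y - h_xy            # I(X;Y)
    variation_info = 2 * h_xy - h_x - h_y     # H(X|Y) + H(Y|X)
    return h_uniform, delta_h, mutual_info, variation_info


# Hypothetical joint distribution of a noisy binary channel.
joint = [[0.4, 0.1],
         [0.1, 0.4]]

h_u, d_h, mi, vi = balance_terms(joint)
# Normalizing by h_u gives three coordinates that sum to 1, which is what
# allows such a balance to be drawn in a ternary (de Finetti) diagram.
coords = (d_h / h_u, 2 * mi / h_u, vi / h_u)
print("balance holds:", math.isclose(h_u, d_h + 2 * mi + vi))
print("ternary coordinates:", tuple(round(c, 3) for c in coords))
```

Because the three normalized terms always sum to one, each joint distribution maps to a single point in a triangle, so different transformations can be compared visually by where their points fall.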
