Abstract: In this study, we explore the potential of visibility graphs in the spectral domain for speaker recognition. Adult participants were instructed to record vocalizations of the five Spanish vowels. For each vocalization, we computed the frequency spectrum considering the source-filter model of speech production, where formants are shaped by the vocal tract acting as a passive filter with resonant frequencies. Spectral profiles exhibited consistent intra-speaker characteristics, reflecting individual vocal tract anatomies, while showing variation between speakers. We then constructed visibility graphs from these spectral profiles and extracted various graph-theoretic metrics to capture their topological features. These metrics were assembled into feature vectors representing the five vowels for each speaker. Using an ensemble of decision trees trained on these features, we achieved high accuracy in speaker identification. Our analysis identified key topological features that were critical in distinguishing between speakers. This study demonstrates the effectiveness of visibility graphs for spectral analysis and their potential in speaker recognition. We also discuss the robustness of this approach, offering insights into its applicability for real-world speaker recognition systems. This research contributes to expanding the feature extraction toolbox for speaker recognition by leveraging the topological properties of speech signals in the spectral domain.
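As an illustration of the graph construction step, the following is a minimal sketch of a natural visibility graph built over a spectral profile, followed by a few simple degree-based metrics. It assumes equally spaced frequency bins (the bin index serves as the horizontal coordinate) and the standard natural visibility criterion; the abstract does not state which visibility variant or which graph metrics were used, so the function names, the chosen metrics, and the toy `spectrum` values are all hypothetical.

```python
def visibility_graph(values):
    """Return the edge set of the natural visibility graph of a sequence.

    Assumption: values[k] is the spectral magnitude at equally spaced bin k.
    """
    n = len(values)
    edges = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            # Bins i and j are linked if every bin between them lies strictly
            # below the straight line joining (i, values[i]) and (j, values[j]).
            if all(
                values[k] < values[i] + (values[j] - values[i]) * (k - i) / (j - i)
                for k in range(i + 1, j)
            ):
                edges.add((i, j))
    return edges


def degree_features(values):
    """Compute a few illustrative topological metrics (not the paper's list)."""
    n = len(values)
    edges = visibility_graph(values)
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return {
        "degrees": degree,
        "mean_degree": sum(degree) / n,
        "max_degree": max(degree),
        "density": 2 * len(edges) / (n * (n - 1)),
    }


# Hypothetical five-bin spectrum, used only to exercise the functions.
spectrum = [1.0, 3.0, 2.0, 4.0, 1.0]
print(sorted(visibility_graph(spectrum)))
print(degree_features(spectrum))
```

The double loop with an inner visibility check is cubic in the number of bins in the worst case, which is acceptable for a sketch; one such feature dictionary per vowel could then be concatenated into a per-speaker feature vector of the kind the abstract describes.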
Abstract: We present a method for reconstructing evolutionary trees from high-dimensional data, with a specific application to bird song spectrograms. We address the challenge of inferring phylogenetic relationships from phenotypic traits, like vocalizations, without predefined acoustic properties. Our approach combines two main components: Poincaré embeddings for dimensionality reduction and distance computation, and the neighbor joining algorithm for tree reconstruction. Unlike previous work, we employ Siamese networks to learn embeddings from only leaf node samples of the latent tree. We demonstrate our method's effectiveness on both synthetic data and spectrograms from six species of finches.
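The two components named above can be sketched as follows: the standard geodesic distance in the Poincaré ball, and a textbook neighbor joining routine operating on a pairwise distance table. This is a minimal sketch under stated assumptions, not the authors' implementation: the embeddings are assumed to be given already (the Siamese network that learns them is not shown), and the function names, the dict-of-dicts input format, and the `internal0`, `internal1`, ... node labels are hypothetical.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


def neighbor_joining(dist):
    """Build an unrooted tree from dist[a][b], a symmetric distance table.

    Returns a list of (node, node, branch_length) edges. Internal nodes are
    named "internal0", "internal1", ... (hypothetical labels).
    """
    d = {a: dict(row) for a, row in dist.items()}
    nodes = list(d)
    edges = []
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums over the nodes that are still active.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        pairs = [(a, b) for idx, a in enumerate(nodes) for b in nodes[idx + 1:]]
        # Join the pair minimising the Q criterion.
        i, j = min(pairs, key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        u = f"internal{len(edges) // 2}"
        len_i = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        len_j = d[i][j] - len_i
        edges += [(i, u, len_i), (j, u, len_j)]
        # Distances from the new internal node to every remaining node.
        rest = [k for k in nodes if k not in (i, j)]
        d[u] = {}
        for k in rest:
            d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - d[i][j])
        nodes = rest + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Hypothetical 2-D embeddings for four leaves, used only to exercise the code.
points = {"A": (0.6, 0.1), "B": (0.6, -0.1), "C": (-0.6, 0.1), "D": (-0.6, -0.1)}
table = {
    a: {b: poincare_distance(points[a], points[b]) for b in points if b != a}
    for a in points
}
for edge in neighbor_joining(table):
    print(edge)
```

Neighbor joining recovers the true topology and branch lengths when the input distances are exactly additive on a tree; with learned embeddings the distances are only approximately tree-like, so the output should be read as an estimate.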