Title: ATGNN: Audio Tagging Graph Neural Network

Nov 02, 2023

Shubhr Singh, Christian J. Steinmetz, Emmanouil Benetos, Huy Phan, Dan Stowell

Abstract: Deep learning models such as CNNs and Transformers have achieved impressive performance for end-to-end audio tagging. Recent works have shown that, despite stacking multiple layers, the receptive field of CNNs remains severely limited. Transformers, on the other hand, are able to map global context through self-attention, but treat the spectrogram as a sequence of patches, which is not flexible enough to capture irregular audio objects. In this work, we treat the spectrogram in a more flexible way by considering it as a graph structure and process it with a novel graph neural architecture called ATGNN. ATGNN not only combines the capability of CNNs with the global information sharing ability of Graph Neural Networks, but also maps semantic relationships between learnable class embeddings and corresponding spectrogram regions. We evaluate ATGNN on two audio tagging tasks, where it achieves 0.585 mAP on the FSD50K dataset and 0.335 mAP on the AudioSet-balanced dataset, results comparable to Transformer-based models with a significantly lower number of learnable parameters.
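
To make the idea of treating a spectrogram as a graph more concrete, here is a minimal NumPy sketch. It is not the ATGNN architecture: the abstract does not specify how nodes, edges, or aggregation are defined. The patch size, the k-nearest-neighbour rule for edges, the max-relative aggregation, and the function names `patchify`, `knn_graph`, and `aggregate` are all assumptions chosen for illustration.

```python
import numpy as np

def patchify(spec, patch=4):
    """Split a (freq, time) spectrogram into non-overlapping patch x patch
    tiles and flatten each tile into one node feature vector."""
    f, t = spec.shape
    f, t = f - f % patch, t - t % patch  # drop ragged edges
    tiles = spec[:f, :t].reshape(f // patch, patch, t // patch, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

def knn_graph(nodes, k=3):
    """Return an (N, k) array holding, for each node, the indices of its
    k nearest neighbours in feature space (self excluded)."""
    sq = (nodes ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * nodes @ nodes.T
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbour
    return np.argsort(dist, axis=1)[:, :k]

def aggregate(nodes, neighbours):
    """Assumed aggregation: concatenate each node with the element-wise
    maximum of (neighbour - node) over its neighbours."""
    rel = nodes[neighbours] - nodes[:, None, :]
    return np.concatenate([nodes, rel.max(axis=1)], axis=1)

# Random stand-in for a log-mel spectrogram: 64 frequency bins, 100 frames.
rng = np.random.default_rng(0)
spec = rng.standard_normal((64, 100))

nodes = patchify(spec, patch=4)  # one node per 4x4 patch
nbrs = knn_graph(nodes, k=3)     # edges follow feature similarity
out = aggregate(nodes, nbrs)     # one round of message passing
```

Because edges are chosen by feature similarity rather than position, a node can exchange information with distant spectrogram regions in a single step. That is one way to read the flexibility the abstract contrasts with a fixed patch sequence.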
