Title: Learning Audio Representations with MLPs

Mar 16, 2022

Mashrur M. Morshed, Ahmad Omar Ahsan, Hasan Mahmud, Md. Kamrul Hasan

Abstract: In this paper, we propose an efficient MLP-based approach for learning audio representations, namely timestamp-level and scene-level audio embeddings. We use an encoder consisting of sequentially stacked gated MLP blocks, which accept 2D MFCCs as inputs. We also provide a simple temporal interpolation-based algorithm for computing scene-level embeddings from timestamp embeddings. The audio representations generated by our method are evaluated across a diverse set of benchmarks at the Holistic Evaluation of Audio Representations (HEAR) challenge, held as part of the NeurIPS 2021 competition track. We achieved first place on the Speech Commands (full), Speech Commands (5 hours), and Mridangam Tonic benchmarks. Furthermore, our approach is the most resource-efficient among all the submitted methods, in terms of both the number of model parameters and the time required to compute embeddings.
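The abstract does not spell out the interpolation algorithm, so the following is only a minimal sketch of one plausible reading: timestamp embeddings of shape (T, D) are linearly resampled along the time axis to a fixed number of steps and then flattened into a single scene-level vector. The function names `resample_time` and `scene_embedding`, the `target_len` parameter, and the choice to flatten rather than pool are all assumptions made for illustration, not details taken from the paper.

```python
from typing import List

Vector = List[float]


def resample_time(timestamps: List[Vector], target_len: int) -> List[Vector]:
    """Linearly interpolate a (T, D) sequence to (target_len, D) along time.

    Hypothetical helper: the paper's exact procedure may differ.
    """
    if not timestamps:
        raise ValueError("need at least one timestamp embedding")
    if target_len < 1:
        raise ValueError("target_len must be positive")

    t = len(timestamps)
    if t == 1:
        # Only one frame available: repeat it for every output step.
        return [list(timestamps[0]) for _ in range(target_len)]

    if target_len == 1:
        # A single output step samples the temporal midpoint.
        positions = [(t - 1) / 2.0]
    else:
        # Evenly spaced fractional indices covering the first to last frame.
        positions = [i * (t - 1) / (target_len - 1) for i in range(target_len)]

    resampled = []
    for pos in positions:
        lo = min(int(pos), t - 2)  # left neighbour, clamped so lo + 1 is valid
        frac = pos - lo            # weight given to the right neighbour
        left, right = timestamps[lo], timestamps[lo + 1]
        resampled.append(
            [(1.0 - frac) * a + frac * b for a, b in zip(left, right)]
        )
    return resampled


def scene_embedding(timestamps: List[Vector], target_len: int = 4) -> Vector:
    """Flatten the resampled sequence into one fixed-size vector (assumed step)."""
    resampled = resample_time(timestamps, target_len)
    return [value for step in resampled for value in step]


if __name__ == "__main__":
    frames = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]  # toy (T=3, D=2) input
    print(scene_embedding(frames, target_len=2))
```

Under these assumptions, clips of any duration map to a vector of length `target_len * D`, which is what makes a fixed-size scene-level embedding possible without extra learned parameters.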

* In submission to Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track
