Marco Furio Colombo

MambaFoley: Foley Sound Generation using Selective State-Space Models

Sep 13, 2024

Marco Furio Colombo, Francesca Ronchini, Luca Comanducci, Fabio Antonacci

Figures 1-4 for MambaFoley: Foley Sound Generation using Selective State-Space Models

Abstract: Recent advancements in deep learning have led to widespread use of techniques for audio content generation, notably employing Denoising Diffusion Probabilistic Models (DDPM) across various tasks. Among these, Foley Sound Synthesis is of particular interest for its role in applications for the creation of multimedia content. Given the temporal-dependent nature of sound, it is crucial to design generative models that can effectively handle the sequential modeling of audio samples. Selective State Space Models (SSMs) have recently been proposed as a valid alternative to previously proposed techniques, demonstrating competitive performance with lower computational complexity. In this paper, we introduce MambaFoley, a diffusion-based model that, to the best of our knowledge, is the first to leverage the recently proposed SSM known as Mamba for the Foley sound generation task. To evaluate the effectiveness of the proposed method, we compare it with a state-of-the-art Foley sound generative model using both objective and subjective analyses.
