Abstract: This paper examines State Space Models (SSMs) and Koopman-based deep learning methods for modelling the dynamics of both linear and non-linear stiff strings. Through experiments with datasets generated under different initial conditions and sample rates, we assess the capacity of these models to capture the complex behaviours observed in string dynamics. The design of these architectures is informed by the structure of the problems at hand. Our findings indicate that our proposed Koopman-based model performs as well as or better than existing approaches in non-linear cases for long-sequence modelling. Although challenges remain in extending model predictions beyond the training horizon (i.e., extrapolation), our investigation focuses on the models' ability to generalise across different initial conditions within the training time interval. This research contributes insights into the physical modelling of dynamical systems (in particular those arising in musical acoustics) by offering a comparative overview of these and previous methods and introducing strategies for model improvement. Our results highlight the efficacy of these models in simulating non-linear dynamics and their applicability to modelling dynamical systems over extended sequences.
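The Koopman idea referred to in this abstract can be illustrated with a minimal sketch: a non-linear system is lifted into a space of observables in which its evolution is linear, and the linear operator is fitted from trajectories started at different initial conditions. The example below is not the paper's model. It assumes a hypothetical two-dimensional toy map (`step`) rather than a stiff string, and a hand-picked observable dictionary (`lift`) in place of a learned encoder; all function names are illustrative.

```python
import numpy as np

def step(x, a=0.9, b=0.5):
    """One step of a toy non-linear map (an assumption, not a string model)."""
    x1, x2 = x
    return np.array([a * x1, b * x2 + (a**2 - b) * x1**2])

def lift(x):
    """Hand-picked observables; a deep Koopman model would learn these."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])

def rollout(x0, n_steps):
    """Simulate the toy map for n_steps from the initial condition x0."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        traj.append(step(traj[-1]))
    return np.array(traj)

def fit_koopman(trajectories):
    """Least-squares fit of K such that lift(x_next) ~= K @ lift(x)."""
    Z = np.array([lift(x) for traj in trajectories for x in traj[:-1]])
    Z_next = np.array([lift(x) for traj in trajectories for x in traj[1:]])
    # Solve Z @ K.T ~= Z_next in the least-squares sense.
    K_T, *_ = np.linalg.lstsq(Z, Z_next, rcond=None)
    return K_T.T

# Training set: trajectories from random initial conditions.
rng = np.random.default_rng(0)
train = [rollout(rng.uniform(-1.0, 1.0, size=2), 20) for _ in range(10)]
K = fit_koopman(train)

# Held-out initial condition, predicted within the training time interval
# by iterating the linear map in the lifted space.
x0 = np.array([0.3, -0.7])
truth = rollout(x0, 20)
z = lift(x0)
pred = [z[:2]]
for _ in range(20):
    z = K @ z
    pred.append(z[:2])
pred = np.array(pred)
max_err = np.max(np.abs(pred - truth))
```

For this toy map the chosen observables span an invariant subspace, so the fitted operator reproduces the non-linear trajectory up to numerical precision. For a real stiff string no such finite dictionary is known in advance, which is one motivation for learning the lifting with a neural network.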
Abstract: Deploying deep learning models on embedded devices is an arduous task: oftentimes, there exist no platform-specific instructions, and compilation times can be considerably long due to the limited computational resources available on-device. Moreover, many music-making applications demand real-time inference. Embedded hardware platforms for audio, such as Bela, offer an entry point for beginners into physical audio computing; however, the need for cross-compilation environments and low-level software development tools for deploying embedded deep learning models imposes high entry barriers on non-expert users. We present a pipeline for deploying neural networks on the Bela embedded hardware platform. In our pipeline, we include a tool to record a multichannel dataset of sensor signals. Additionally, we provide a dockerised cross-compilation environment for faster compilation. With this pipeline, we aim to provide a template for programmers and makers to prototype and experiment with neural networks for real-time embedded musical applications.
Abstract: In this work, we propose a method for the controllable synthesis of real-time contact sounds using neural resonators. Previous works have used physically inspired statistical methods and physical modelling for object materials and excitation signals. Our method incorporates differentiable second-order resonators and estimates their coefficients using a neural network that is conditioned on physical parameters. This allows for interactive dynamic control and the generation of novel sounds in an intuitive manner. We demonstrate the practical implementation of our method and explore its potential creative applications.
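As a rough illustration of a second-order resonator, the sketch below implements a single two-pole filter in NumPy and derives its coefficients from a resonant frequency and a decay time. It is a sketch under stated assumptions, not the method of the abstract: the closed-form mapping in `resonator_coeffs` stands in for the neural network that would estimate the coefficients, the loop is not differentiable as written (an autodiff framework would be needed for training), and all names and parameter values are hypothetical.

```python
import numpy as np

def resonator_coeffs(freq_hz, t60_s, sample_rate):
    """Map a frequency and a 60 dB decay time to two-pole coefficients."""
    # Pole radius chosen so the envelope r**n falls by 60 dB after t60_s.
    r = 10.0 ** (-3.0 / (t60_s * sample_rate))
    omega = 2.0 * np.pi * freq_hz / sample_rate
    return -2.0 * r * np.cos(omega), r * r

def resonate(excitation, a1, a2):
    """Apply y[n] = x[n] - a1 * y[n-1] - a2 * y[n-2] to the excitation."""
    y = np.zeros(len(excitation))
    y1 = y2 = 0.0  # previous two output samples
    for n, x in enumerate(excitation):
        y[n] = x - a1 * y1 - a2 * y2
        y2, y1 = y1, y[n]
    return y

# Excite the resonator with a unit impulse, a crude stand-in for a contact.
sample_rate = 16000
impulse = np.zeros(sample_rate)
impulse[0] = 1.0
a1, a2 = resonator_coeffs(440.0, 0.5, sample_rate)
out = resonate(impulse, a1, a2)
```

Changing `freq_hz` or `t60_s` between excitations changes the pitch and damping of the resulting sound, which is the kind of parameter-level control that a coefficient-predicting network would expose through physical parameters instead.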
Abstract: We propose an analysis-by-synthesis method for fast multi-view 3D reconstruction of opaque objects with arbitrary materials and illumination. State-of-the-art methods use both neural surface representations and neural rendering. While flexible, neural surface representations are a significant bottleneck in optimization runtime. Instead, we represent surfaces as triangle meshes and build a differentiable rendering pipeline around triangle rasterization and neural shading. The renderer is used in a gradient descent optimization where both a triangle mesh and a neural shader are jointly optimized to reproduce the multi-view images. We evaluate our method on a public 3D reconstruction dataset and show that it can match the reconstruction accuracy of traditional baselines and neural approaches while surpassing them in optimization runtime. Additionally, we investigate the shader and find that it learns an interpretable representation of appearance, enabling applications such as 3D material editing.
Abstract: Physical models of rigid bodies are used for sound synthesis in applications from virtual environments to music production. Traditional methods such as modal synthesis often rely on computationally expensive numerical solvers, while recent deep learning approaches are limited by the need to post-process their results. In this work, we present a novel end-to-end framework for training a deep neural network to generate modal resonators for a given 2D shape and material, using a bank of differentiable IIR filters. We demonstrate our method on a dataset of synthetic objects, but train our model using an audio-domain objective, paving the way for physically-informed synthesisers to be learned directly from recordings of real-world objects.