Alistair Carson

Gradient-based Optimisation of Modulation Effects

Jan 08, 2026

Alistair Carson, Alec Wright, Stefan Bilbao

Abstract: Modulation effects such as phasers, flangers and chorus effects are heavily used in conjunction with the electric guitar. Machine learning based emulation of analog modulation units has been investigated in recent years, but most methods have either been limited to one class of effect or suffer from a high computational cost or latency compared to canonical digital implementations. Here, we build on previous work and present a framework for modelling flanger, chorus and phaser effects based on differentiable digital signal processing. The model is trained in the time-frequency domain, but at inference operates in the time domain, requiring zero latency. We investigate the challenges associated with gradient-based optimisation of such effects, and show that low-frequency weighting of loss functions avoids convergence to local minima when learning delay times. We show that when trained against analog effects units, sound output from the model is in some cases perceptually indistinguishable from the reference, but challenges still remain for effects with long delay times and feedback.

* Submitted to J. Audio Eng. Soc. Dec. 2025
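
The abstract above mentions low-frequency weighting of the loss. As a rough illustration only, the sketch below shows one way a spectral loss could emphasise low frequencies. The function name, the weighting curve `1 / (1 + f / f0)`, the corner frequency `f0`, and the use of a single FFT over the whole signal (rather than a time-frequency representation) are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def lf_weighted_spectral_loss(pred, target, sample_rate, f0=500.0):
    """Weighted squared error between magnitude spectra.

    The weighting 1 / (1 + f / f0) is a hypothetical choice that
    gives low-frequency bins more influence than high-frequency bins.
    """
    freqs = np.fft.rfftfreq(len(target), d=1.0 / sample_rate)
    weights = 1.0 / (1.0 + freqs / f0)  # assumed weighting curve
    diff = np.abs(np.fft.rfft(pred)) - np.abs(np.fft.rfft(target))
    return float(np.sum(weights * diff ** 2) / np.sum(weights))

# Usage: the same-amplitude error costs more at 100 Hz than at 3 kHz.
sr = 8000
t = np.arange(800) / sr
silence = np.zeros(800)
low = np.sin(2 * np.pi * 100 * t)
high = np.sin(2 * np.pi * 3000 * t)
loss_low = lf_weighted_spectral_loss(low, silence, sr)
loss_high = lf_weighted_spectral_loss(high, silence, sr)
```

With this weighting, an error of a given size is penalised more heavily when it sits at low frequencies.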

Anti-aliasing of neural distortion effects via model fine tuning

May 16, 2025

Alistair Carson, Alec Wright, Stefan Bilbao

Abstract: Neural networks have become ubiquitous in guitar distortion effects modelling in recent years. Despite their ability to yield perceptually convincing models, they are susceptible to frequency aliasing when driven by high-frequency and high-gain inputs. Nonlinear activation functions create both the desired harmonic distortion and unwanted aliasing distortion as the bandwidth of the signal is expanded beyond the Nyquist frequency. Here, we present a method for reducing aliasing in neural models via a teacher-student fine-tuning approach, where the teacher is a pre-trained model with its weights frozen, and the student is a copy of this with learnable parameters. The student is fine-tuned against an aliasing-free dataset generated by passing sinusoids through the original model and removing non-harmonic components from the output spectra. Our results show that this method significantly suppresses aliasing for both long short-term memory (LSTM) networks and temporal convolutional networks (TCNs). In the majority of our case studies, the reduction in aliasing was greater than that achieved by two times oversampling. One side-effect of the proposed method is that harmonic distortion components are also affected. This adverse effect was found to be model-dependent, with the LSTM models giving the best balance between anti-aliasing and preserving the perceived similarity to an analog reference device.

* Accepted for DAFx25
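
The dataset-generation step described in the abstract (pass a sinusoid through the model, then remove non-harmonic spectral components) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: a `tanh` waveshaper stands in for the pre-trained neural model, the helper name `harmonic_only` is hypothetical, and the test frequency is chosen to fall exactly on an FFT bin so that harmonics can be selected by bin index. Aliased components that happen to land exactly on a harmonic bin would not be removed by this simple mask.

```python
import numpy as np

def harmonic_only(y, f0_bin):
    """Zero every FFT bin that is not DC or an integer multiple of f0_bin."""
    Y = np.fft.rfft(y)
    mask = np.zeros(len(Y), dtype=bool)
    mask[::f0_bin] = True  # DC and harmonics of the fundamental
    return np.fft.irfft(np.where(mask, Y, 0.0), n=len(y))

sample_rate = 44100
n = sample_rate                 # one second of audio, so bins are 1 Hz apart
f0 = 1245                       # fundamental in Hz, equal to its bin index here
t = np.arange(n) / sample_rate
# Stand-in for the teacher model: a memoryless tanh distortion (assumption).
y = np.tanh(10.0 * np.sin(2 * np.pi * f0 * t))
y_clean = harmonic_only(y, f0_bin=f0)
```

Here `y - y_clean` holds the components that folded back from above the Nyquist frequency onto non-harmonic bins, and `y_clean` would play the role of a training target for the student.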

Resampling Filter Design for Multirate Neural Audio Effect Processing

Jan 30, 2025

Alistair Carson, Vesa Välimäki, Alec Wright, Stefan Bilbao

Abstract: Neural networks have become ubiquitous in audio effects modelling, especially for guitar amplifiers and distortion pedals. One limitation of such models is that the sample rate of the training data is implicitly encoded in the model weights and therefore not readily adjustable at inference. Recent work explored modifications to recurrent neural network architecture to approximate a sample rate independent system, enabling audio processing at a rate that differs from the original training rate. This method works well for integer oversampling and can reduce aliasing caused by nonlinear activation functions. For small fractional changes in sample rate, fractional delay filters can be used to approximate sample rate independence, but in some cases this method fails entirely. Here, we explore the use of signal resampling at the input and output of the neural network as an alternative solution. We investigate several resampling filter designs and show that a two-stage design consisting of a half-band IIR filter cascaded with a Kaiser window FIR filter can give results similar to or better than the previously proposed model adjustment method with many fewer operations per sample and less than one millisecond of latency at typical audio rates. Furthermore, we investigate interpolation and decimation filters for the task of integer oversampling and show that cascaded half-band IIR and FIR designs can be used in conjunction with the model adjustment method to reduce aliasing in a range of distortion effect models.

* Preprint
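
For readers unfamiliar with the Kaiser window FIR stage mentioned in the abstract, the following is a generic windowed-sinc low-pass design using NumPy. It is a textbook sketch, not the paper's filter: the function name, tap count, cutoff and `beta` value are illustrative assumptions, and the half-band IIR stage is not shown.

```python
import numpy as np

def kaiser_lowpass(num_taps, cutoff, beta):
    """Windowed-sinc low-pass FIR filter.

    cutoff is normalised to the Nyquist frequency (0 < cutoff < 1);
    beta controls the trade-off between stopband attenuation and
    transition width of the Kaiser window.
    """
    n = np.arange(num_taps) - (num_taps - 1) / 2   # centre the taps on zero
    h = cutoff * np.sinc(cutoff * n) * np.kaiser(num_taps, beta)
    return h / h.sum()                              # unity gain at DC

# Illustrative values only.
h = kaiser_lowpass(num_taps=63, cutoff=0.5, beta=8.0)
```

The resulting filter is symmetric (linear phase), so its delay is `(num_taps - 1) / 2` samples; shorter filters trade stopband attenuation for lower latency.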

Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models

Nov 22, 2024

Alec Wright, Alistair Carson, Lauri Juvela

Abstract: This paper introduces Open-Amp, a synthetic data framework for generating large-scale and diverse audio effects data. Audio effects are relevant to many musical audio processing and Music Information Retrieval (MIR) tasks, such as modelling of analog audio effects, automatic mixing, tone matching and transcription. Existing audio effects datasets are limited in scope, usually including relatively few audio effects processors and a limited number of input audio signals. Our proposed framework overcomes these issues by crowdsourcing neural network emulations of guitar amplifiers and effects, created by users of open-source audio effects emulation software. This allows users of Open-Amp complete control over the input signals to be processed by the effects models, as well as providing high-quality emulations of hundreds of devices. Open-Amp can render audio online during training, allowing great flexibility in data augmentation. Our experiments show that using Open-Amp to train a guitar effects encoder achieves new state-of-the-art results on multiple guitar effects classification tasks. Furthermore, we train a one-to-many guitar effects model using Open-Amp, and use it to emulate unseen analog effects via manipulation of its learned latent space, indicating transferability to analog guitar effects data.

Interpolation filter design for sample rate independent audio effect RNNs

Sep 24, 2024

Alistair Carson, Alec Wright, Stefan Bilbao

Abstract: Recurrent neural networks (RNNs) are effective at emulating the non-linear, stateful behaviour of analog guitar amplifiers and distortion effects. Unlike the case of direct circuit simulation, RNNs have a fixed sample rate encoded in their model weights, making the sample rate non-adjustable during inference. Recent work has proposed increasing the sample rate of RNNs at inference (oversampling) by increasing the feedback delay length in samples, using a fractional delay filter for non-integer conversions. Here, we investigate the task of lowering the sample rate at inference (undersampling), and propose using an extrapolation filter to approximate the required fractional signal advance. We consider two filter design methods and analyse the impact of filter order on audio quality. Our results show that the correct choice of filter can give high quality results for both oversampling and undersampling; however, in some cases the sample rate adjustment leads to unwanted artefacts in the output signal. We analyse these failure cases through linearised stability analysis, showing that they result from instability around a fixed point. This approach enables an informed prediction of suitable interpolation filters for a given RNN model before runtime.

Sample Rate Independent Recurrent Neural Networks for Audio Effects Processing

Jun 10, 2024

Alistair Carson, Alec Wright, Jatin Chowdhury, Vesa Välimäki, Stefan Bilbao

Abstract: In recent years, machine learning approaches to modelling guitar amplifiers and effects pedals have been widely investigated and have become standard practice in some consumer products. In particular, recurrent neural networks (RNNs) are a popular choice for modelling non-linear devices such as vacuum tube amplifiers and distortion circuitry. One limitation of such models is that they are trained on audio at a specific sample rate and therefore give unreliable results when operating at another rate. Here, we investigate several methods of modifying RNN structures to make them approximately sample rate independent, with a focus on oversampling. In the case of integer oversampling, we demonstrate that a previously proposed delay-based approach provides high fidelity sample rate conversion whilst additionally reducing aliasing. For non-integer sample rate adjustment, we propose two novel methods and show that one of these, based on cubic Lagrange interpolation of a delay line, provides a significant improvement over existing methods. To our knowledge, this work provides the first in-depth study into this problem.

* Accepted for publication in Proc. DAFx24, Guildford, UK, September 2024
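
As background for the cubic Lagrange interpolation mentioned in the abstract, the sketch below computes the standard Lagrange fractional-delay FIR coefficients. It illustrates the general interpolation formula only; how the paper applies it inside the RNN feedback path is not reproduced here, and the function name is an assumption.

```python
import numpy as np

def lagrange_fd_coeffs(delay, order=3):
    """Lagrange FIR coefficients h[0..order] approximating a delay of `delay` samples.

    h[k] = prod over m != k of (delay - m) / (k - m).
    order=3 gives cubic interpolation across four neighbouring samples.
    """
    h = np.ones(order + 1)
    for k in range(order + 1):
        for m in range(order + 1):
            if m != k:
                h[k] *= (delay - m) / (k - m)
    return h

# A delay of 1.5 samples sits midway between taps 1 and 2.
h_half = lagrange_fd_coeffs(1.5)   # [-0.0625, 0.5625, 0.5625, -0.0625]
```

For an integer delay the coefficients collapse to a single unit tap, and for any delay they sum to one, so DC gain is preserved.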

Differentiable All-pole Filters for Time-varying Audio Systems

Apr 12, 2024

Chin-Yun Yu, Christopher Mitcheltree, Alistair Carson, Stefan Bilbao, Joshua D. Reiss, György Fazekas

Abstract: Infinite impulse response filters are an essential building block of many time-varying audio systems, such as audio effects and synthesisers. However, their recursive structure impedes end-to-end training of these systems using automatic differentiation. Although non-recursive filter approximations like frequency sampling and frame-based processing have been proposed and widely used in previous works, they cannot accurately reflect the gradient of the original system. We alleviate this difficulty by re-expressing a time-varying all-pole filter to backpropagate the gradients through itself, so the filter implementation is not bound to the technical limitations of automatic differentiation frameworks. This implementation can be employed within any audio system containing filters with poles for efficient gradient evaluation. We demonstrate its training efficiency and expressive capabilities for modelling real-world dynamic audio systems on a phaser, time-varying subtractive synthesiser, and feed-forward compressor. We make our code available and provide the trained audio effect and synth models in a VST plugin at https://christhetree.github.io/all_pole_filters/.

* Submitted to DAFx 2024
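
To make the recursion discussed in the abstract concrete, here is a plain NumPy sketch of the forward pass of a time-varying all-pole filter. It shows only the sample-by-sample recurrence that makes automatic differentiation awkward; the paper's contribution, the efficient backward pass, is not implemented here. The function name and the coefficient layout `a[n, k-1]` are assumptions for this example.

```python
import numpy as np

def time_varying_all_pole(x, a):
    """Forward recursion y[n] = x[n] - sum_{k=1..M} a[n, k-1] * y[n-k].

    x has shape (N,); a has shape (N, M), one coefficient set per sample.
    Samples before the start of the signal are taken to be zero.
    """
    n_samples, order = a.shape
    y = np.zeros(n_samples)
    for n in range(n_samples):
        acc = x[n]
        for k in range(1, order + 1):
            if n - k >= 0:
                acc -= a[n, k - 1] * y[n - k]
        y[n] = acc
    return y

# With constant coefficients this reduces to an ordinary one-pole filter.
impulse = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
coeffs = np.full((5, 1), -0.5)
response = time_varying_all_pole(impulse, coeffs)   # 1, 0.5, 0.25, ...
```

Each output sample depends on previous outputs, so the loop cannot be vectorised directly across time.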

Differentiable Grey-box Modelling of Phaser Effects using Frame-based Spectral Processing

Jun 02, 2023

Alistair Carson, Cassia Valentini-Botinhao, Simon King, Stefan Bilbao

Abstract: Machine learning approaches to modelling analog audio effects have seen intensive investigation in recent years, particularly in the context of non-linear time-invariant effects such as guitar amplifiers. For modulation effects such as phasers, however, new challenges emerge due to the presence of the low-frequency oscillator which controls the slowly time-varying nature of the effect. Existing approaches have either required foreknowledge of this control signal, or have been non-causal in implementation. This work presents a differentiable digital signal processing approach to modelling phaser effects in which the underlying control signal and time-varying spectral response of the effect are jointly learned. The proposed model processes audio in short frames to implement a time-varying filter in the frequency domain, with a transfer function based on typical analog phaser circuit topology. We show that the model can be trained to emulate an analog reference device, while retaining interpretable and adjustable parameters. The frame duration is an important hyper-parameter of the proposed model, so an investigation was carried out into its effect on model accuracy. The optimal frame length depends on both the rate and transient decay-time of the target effect, but the frame length can be altered at inference time without a significant change in accuracy.

* Accepted for publication in Proc. DAFx23, Copenhagen, Denmark, September 2023
