Lukas Gonon

Random Feature Representation Boosting

Jan 30, 2025

Nikita Zozoulenko, Thomas Cass, Lukas Gonon

Abstract: We introduce Random Feature Representation Boosting (RFRBoost), a novel method for constructing deep residual random feature neural networks (RFNNs) using boosting theory. RFRBoost uses random features at each layer to learn the functional gradient of the network representation, enhancing performance while preserving the convex optimization benefits of RFNNs. In the case of MSE loss, we obtain closed-form solutions to greedy layer-wise boosting with random features. For general loss functions, we show that fitting random feature residual blocks reduces to solving a quadratically constrained least squares problem. We demonstrate, through numerical experiments on 91 tabular datasets for regression and classification, that RFRBoost significantly outperforms traditional RFNNs and end-to-end trained MLP ResNets, while offering substantial computational advantages and theoretical guarantees stemming from boosting theory.
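
As a point of reference for the abstract above, the following is a minimal sketch of a plain single-layer random feature neural network with a closed-form ridge readout, the kind of baseline the abstract calls an RFNN. It is not RFRBoost itself: the residual boosting steps are not shown, and the function names, the tanh activation, the feature count, and the regularization strength are all illustrative assumptions.

```python
import numpy as np

def fit_random_feature_ridge(X, y, n_features=200, reg=1e-3, seed=0):
    """Fit a random feature model: hidden weights are sampled once and
    frozen; only the linear readout is solved for (assumed setup)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(size=(d, n_features))   # fixed random hidden weights
    b = rng.normal(size=n_features)        # fixed random biases
    Phi = np.tanh(X @ W + b)               # random feature matrix
    # Ridge regression has the closed form (Phi^T Phi + reg I)^{-1} Phi^T y,
    # so training is a single convex linear solve.
    A = Phi.T @ Phi + reg * np.eye(n_features)
    beta = np.linalg.solve(A, Phi.T @ y)
    return W, b, beta

def predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression problem (synthetic data, for illustration only).
rng = np.random.default_rng(1)
X_train = rng.uniform(-1, 1, size=(500, 2))
y_train = np.sin(3 * X_train[:, 0]) + X_train[:, 1] ** 2

W, b, beta = fit_random_feature_ridge(X_train, y_train)
train_mse = float(np.mean((predict(X_train, W, b, beta) - y_train) ** 2))
baseline_mse = float(np.mean((y_train - y_train.mean()) ** 2))
print(f"train MSE: {train_mse:.4g}, predict-the-mean MSE: {baseline_mse:.4g}")
```

Because only `beta` is learned, the fit stays a convex least squares problem, which is the property the abstract says is preserved when layers are added.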

Fast Deep Hedging with Second-Order Optimization

Oct 29, 2024

Konrad Mueller, Amira Akkari, Lukas Gonon, Ben Wood

Abstract: Hedging exotic options in the presence of market frictions is an important risk management task. Deep hedging can solve such hedging problems by training neural network policies in realistic simulated markets. Training these neural networks may be delicate and suffer from slow convergence, particularly for options with long maturities and complex sensitivities to market parameters. To address this, we propose a second-order optimization scheme for deep hedging. We leverage pathwise differentiability to construct a curvature matrix, which we approximate as block-diagonal and Kronecker-factored to efficiently precondition gradients. We evaluate our method on a challenging and practically important problem: hedging a cliquet option on a stock with stochastic volatility by trading in the spot and vanilla options. We find that our second-order scheme can optimize the policy in 1/4 of the number of steps that standard adaptive moment-based optimization takes.
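
The efficiency gain from a Kronecker-factored curvature block can be illustrated with a generic linear-algebra sketch. It assumes a single layer whose curvature block is exactly a Kronecker product of two symmetric positive definite factors `G` and `A` (both randomly generated here, purely hypothetical) and checks that preconditioning with the small factors matches the dense solve. It does not reproduce the paper's curvature matrix or its hedging setup.

```python
import numpy as np

def random_spd(n, rng):
    """Random symmetric positive definite matrix (illustrative factor)."""
    M = rng.normal(size=(n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
n_out, n_in = 4, 3
G = random_spd(n_out, rng)             # output-side factor (assumed)
A = random_spd(n_in, rng)              # input-side factor (assumed)
grad = rng.normal(size=(n_out, n_in))  # gradient of one weight matrix

# Dense route: build the (n_out*n_in) x (n_out*n_in) block and solve.
# With NumPy's row-major flattening, kron(G, A) @ vec(X) = vec(G X A^T).
dense = np.linalg.solve(np.kron(G, A), grad.reshape(-1)).reshape(n_out, n_in)

# Factored route: two small solves, G^{-1} grad A^{-1} (A is symmetric).
factored = np.linalg.solve(A, np.linalg.solve(G, grad).T).T

print(np.allclose(dense, factored))
```

The factored route never forms the large matrix, so its cost scales with the sizes of `G` and `A` rather than with their product.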

An Overview on Machine Learning Methods for Partial Differential Equations: from Physics Informed Neural Networks to Deep Operator Learning

Aug 23, 2024

Lukas Gonon, Arnulf Jentzen, Benno Kuckuck, Siyu Liang, Adrian Riekert, Philippe von Wurstemberger

Abstract: The approximation of solutions of partial differential equations (PDEs) with numerical algorithms is a central topic in applied mathematics. For many decades, various types of methods for this purpose have been developed and extensively studied. One class of methods that has received a lot of attention in recent years is machine learning-based methods, which typically involve the training of artificial neural networks (ANNs) by means of stochastic gradient descent type optimization methods. While approximation methods for PDEs using ANNs were first proposed in the 1990s, they have only gained wide popularity in the last decade with the rise of deep learning. This article aims to provide an introduction to some of these methods and the mathematical theory on which they are based. We discuss methods such as physics-informed neural networks (PINNs) and deep BSDE methods and consider several operator learning approaches.

Variance Norms for Kernelized Anomaly Detection

Jul 16, 2024

Thomas Cass, Lukas Gonon, Nikita Zozoulenko

Abstract: We present a unified theory for Mahalanobis-type anomaly detection on Banach spaces, using ideas from Cameron-Martin theory applied to non-Gaussian measures. This approach leads to a basis-free, data-driven notion of anomaly distance through the so-called variance norm of a probability measure, which can be consistently estimated using empirical measures. Our framework generalizes the classical $\mathbb{R}^d$, functional $(L^2[0,1])^d$, and kernelized settings, including the general case of non-injective covariance operator. We prove that the variance norm depends solely on the inner product in a given Hilbert space, and hence that the kernelized Mahalanobis distance can naturally be recovered by working on reproducing kernel Hilbert spaces. Using the variance norm, we introduce the notion of a kernelized nearest-neighbour Mahalanobis distance for semi-supervised anomaly detection. In an empirical study on 12 real-world datasets, we demonstrate that the kernelized nearest-neighbour Mahalanobis distance outperforms the traditional kernelized Mahalanobis distance for multivariate time series anomaly detection, using state-of-the-art time series kernels such as the signature, global alignment, and Volterra reservoir kernels. Moreover, we provide an initial theoretical justification of nearest-neighbour Mahalanobis distances by developing concentration inequalities in the finite-dimensional Gaussian case.
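
For orientation, the classical $\mathbb{R}^d$ setting that the abstract generalizes can be sketched as follows. This is only the textbook Mahalanobis distance to the empirical mean, not the variance norm or the nearest-neighbour variant from the paper. The function name and the synthetic data are assumptions, and using a pseudo-inverse for a possibly singular covariance is one common convention, not necessarily the paper's treatment of the non-injective case.

```python
import numpy as np

def mahalanobis_scores(train, queries):
    """Anomaly score of each query: Mahalanobis distance to the mean of
    `train`, using the empirical covariance (illustrative helper)."""
    mean = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)
    # Pseudo-inverse so the computation also runs when cov is singular
    # (a convention chosen here, labelled as an assumption).
    cov_pinv = np.linalg.pinv(cov)
    diffs = queries - mean
    sq = np.einsum("ij,jk,ik->i", diffs, cov_pinv, diffs)
    return np.sqrt(np.maximum(sq, 0.0))

# Synthetic "normal" data with very different scales per coordinate.
rng = np.random.default_rng(0)
normal_data = rng.normal(size=(1000, 3)) * np.array([1.0, 0.1, 2.0])

queries = np.array([
    [0.0, 0.0, 0.0],   # close to the mean
    [0.0, 1.0, 0.0],   # far out along the low-variance coordinate
])
scores = mahalanobis_scores(normal_data, queries)
print(scores)
```

The second query is only one unit from the mean in Euclidean terms, but it lies many standard deviations out along the second coordinate, so it receives the much larger score.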

Universal randomised signatures for generative time series modelling

Jun 14, 2024

Francesca Biagini, Lukas Gonon, Niklas Walter

Abstract: Randomised signature has been proposed as a flexible and easily implementable alternative to the well-established path signature. In this article, we employ randomised signature to introduce a generative model for financial time series data in the spirit of reservoir computing. Specifically, we propose a novel Wasserstein-type distance based on discrete-time randomised signatures. This metric on the space of probability measures captures the distance between (conditional) distributions. Its use is justified by our novel universal approximation results for randomised signatures on the space of continuous functions taking the underlying path as an input. We then use our metric as the loss function in a non-adversarial generator model for synthetic time series data based on a reservoir neural stochastic differential equation. We compare the results of our model to benchmarks from the existing literature.

* 33 pages

Universal Approximation Theorem and error bounds for quantum neural networks and quantum reservoirs

Jul 24, 2023

Lukas Gonon, Antoine Jacquier

Abstract: Universal approximation theorems are the foundations of classical neural networks, providing theoretical guarantees that the latter are able to approximate maps of interest. Recent results have shown that this can also be achieved in a quantum setting, whereby classical functions can be approximated by parameterised quantum circuits. We provide here precise error bounds for specific classes of functions and extend these results to the interesting new setup of randomised quantum circuits, mimicking classical reservoir neural networks. Our results show in particular that a quantum neural network with $\mathcal{O}(\varepsilon^{-2})$ weights and $\mathcal{O}(\lceil \log_2(\varepsilon^{-1}) \rceil)$ qubits suffices to achieve accuracy $\varepsilon>0$ when approximating functions with integrable Fourier transform.

* 20 pages, 0 figures

Infinite-dimensional reservoir computing

Apr 02, 2023

Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: Reservoir computing approximation and generalization bounds are proved for a new concept class of input/output systems that extends the so-called generalized Barron functionals to a dynamic context. This new class is characterized by the readouts with a certain integral representation built on infinite-dimensional state-space systems. It is shown that this class is very rich and possesses useful features and universal approximation properties. The reservoir architectures used for the approximation and estimation of elements in the new class are randomly generated echo state networks with either linear or ReLU activation functions. Their readouts are built using randomly generated neural networks in which only the output layer is trained (extreme learning machines or random feature neural networks). The results in the paper yield a fully implementable recurrent neural network-based learning algorithm with provable convergence guarantees that do not suffer from the curse of dimensionality.
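
The general recipe described above, a randomly generated echo state network with a ReLU activation in which only the output layer is trained, can be sketched roughly as follows. This is a simplified illustration rather than the paper's construction: the readout here is a plain linear ridge regression instead of a random feature network, and the state dimension, the scaling of the reservoir matrix, and the toy delay task are all assumptions.

```python
import numpy as np

def reservoir_states(inputs, n_state=100, seed=0):
    """Run a randomly generated ReLU echo state network over a scalar
    input sequence and return the state at every time step."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n_state, n_state))
    # Rescale to operator norm 0.9; since ReLU is 1-Lipschitz, the state
    # map is then a contraction in the state (assumed design choice).
    A *= 0.9 / np.linalg.norm(A, 2)
    C = rng.normal(size=n_state)
    x = np.zeros(n_state)
    states = np.empty((len(inputs), n_state))
    for t, u in enumerate(inputs):
        x = np.maximum(A @ x + C * u, 0.0)   # reservoir weights stay fixed
        states[t] = x
    return states

def fit_readout(states, targets, reg=1e-6):
    """Only the linear output layer is trained, by ridge regression."""
    n = states.shape[1]
    return np.linalg.solve(states.T @ states + reg * np.eye(n),
                           states.T @ targets)

# Toy task (synthetic): recover the previous input from the current state.
rng = np.random.default_rng(1)
u = rng.uniform(-1, 1, size=2000)
states = reservoir_states(u)[1:]   # states at times 1, ..., T-1
targets = u[:-1]                   # inputs at times 0, ..., T-2

w = fit_readout(states, targets)
mse = float(np.mean((states @ w - targets) ** 2))
print(f"readout MSE: {mse:.4g}, target power: {np.mean(targets ** 2):.4g}")
```

Since the recurrent weights are never updated, training reduces to one linear solve for `w`, which is what makes this family of models straightforward to implement.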

The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality

Jan 19, 2023

Lukas Gonon, Robin Graeber, Arnulf Jentzen

Abstract: In this article we study high-dimensional approximation capacities of shallow and deep artificial neural networks (ANNs) with the rectified linear unit (ReLU) activation. In particular, it is a key contribution of this work to reveal that for all $a,b\in\mathbb{R}$ with $b-a\geq 7$ we have that the functions $[a,b]^d\ni x=(x_1,\dots,x_d)\mapsto\prod_{i=1}^d x_i\in\mathbb{R}$ for $d\in\mathbb{N}$ as well as the functions $[a,b]^d\ni x =(x_1,\dots, x_d)\mapsto\sin(\prod_{i=1}^d x_i) \in \mathbb{R} $ for $ d \in \mathbb{N} $ can neither be approximated without the curse of dimensionality by means of shallow ANNs nor by insufficiently deep ANNs with ReLU activation but can be approximated without the curse of dimensionality by sufficiently deep ANNs with ReLU activation. We show that the product functions and the sine of the product functions are polynomially tractable approximation problems among the approximating class of deep ReLU ANNs with the number of hidden layers being allowed to grow in the dimension $ d \in \mathbb{N} $. We establish the above outlined statements not only for the product functions and the sine of the product functions but also for other classes of target functions, in particular, for classes of uniformly globally bounded $ C^{ \infty } $-functions with compact support on any $[a,b]^d$ with $a\in\mathbb{R}$, $b\in(a,\infty)$. Roughly speaking, in this work we lay open that simple approximation problems such as approximating the sine or cosine of products cannot be solved in standard implementation frameworks by shallow or insufficiently deep ANNs with ReLU activation in polynomial time, but can be approximated by sufficiently deep ReLU ANNs with the number of parameters growing at most polynomially.

* 101 pages, 1 figure. arXiv admin note: substantial text overlap with arXiv:2112.14523

Reservoir kernels and Volterra series

Dec 30, 2022

Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: A universal kernel is constructed whose sections approximate any causal and time-invariant filter in the fading memory category with inputs and outputs in a finite-dimensional Euclidean space. This kernel is built using the reservoir functional associated with a state-space representation of the Volterra series expansion available for any analytic fading memory filter. It is hence called the Volterra reservoir kernel. Even though the state-space representation and the corresponding reservoir feature map are defined on an infinite-dimensional tensor algebra space, the kernel map is characterized by explicit recursions that are readily computable for specific data sets when employed in estimation problems using the representer theorem. We showcase the performance of the Volterra reservoir kernel in a popular data science application in relation to bitcoin price prediction.

* 10 pages, 2 figures, 1 table

Deep neural network expressivity for optimal stopping problems

Oct 19, 2022

Lukas Gonon

Abstract: This article studies deep neural network expression rates for optimal stopping problems of discrete-time Markov processes on high-dimensional state spaces. A general framework is established in which the value function and continuation value of an optimal stopping problem can be approximated with error at most $\varepsilon$ by a deep ReLU neural network of size at most $\kappa d^{\mathfrak{q}} \varepsilon^{-\mathfrak{r}}$. The constants $\kappa,\mathfrak{q},\mathfrak{r} \geq 0$ do not depend on the dimension $d$ of the state space or the approximation accuracy $\varepsilon$. This proves that deep neural networks do not suffer from the curse of dimensionality when employed to solve optimal stopping problems. The framework covers, for example, exponential Lévy models, discrete diffusion processes and their running minima and maxima. These results mathematically justify the use of deep neural networks for numerically solving optimal stopping problems and pricing American options in high dimensions.
