Rama Cont

Asymptotic Analysis of Deep Residual Networks

Dec 15, 2022

Rama Cont, Alain Rossier, Renyuan Xu

Abstract: We investigate the asymptotic properties of deep residual networks (ResNets) as the number of layers increases. We first show the existence of scaling regimes for trained weights markedly different from those implicitly assumed in the neural ODE literature. We study the convergence of the hidden state dynamics in these scaling regimes, showing that one may obtain an ODE, a stochastic differential equation (SDE) or neither of these. In particular, our findings point to the existence of a diffusive regime in which the deep network limit is described by a class of SDEs. Finally, we derive the corresponding scaling limits for the backpropagation dynamics.

* 49 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:2105.12245
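
As a rough illustration of the difference between an ODE limit and a diffusive limit mentioned in the abstract, the following sketch uses a simplified residual update that is linear in the weights. The notation (h_k, A_k^{(L)}, g, a, W) is assumed for illustration and is not taken from the paper; the simplified form is chosen only so that the two limits reduce to the standard Euler and Euler-Maruyama schemes.

```latex
% Assumed notation: h_k is the hidden state at layer k, L the depth,
% g a Lipschitz activation applied componentwise, A_k^{(L)} the layer weights.
\begin{align*}
  h_{k+1} &= h_k + A_k^{(L)}\, g(h_k), \qquad k = 0, \dots, L-1. \\[6pt]
  % Smooth weights of size 1/L: explicit Euler scheme with step 1/L.
  A_k^{(L)} = \tfrac{1}{L}\, a\!\left(\tfrac{k}{L}\right)
    &\;\Longrightarrow\; \frac{dh_t}{dt} = a(t)\, g(h_t)
    \quad \text{(ODE limit)}, \\[6pt]
  % Weights behaving like Brownian increments, of size L^{-1/2}:
  % Euler--Maruyama scheme with step 1/L.
  A_k^{(L)} = W_{(k+1)/L} - W_{k/L}
    &\;\Longrightarrow\; dh_t = dW_t\, g(h_t)
    \quad \text{(It\^o SDE limit)}.
\end{align*}
```

Here a is a smooth matrix-valued function on [0, 1] and W a matrix-valued Brownian motion; both are hypothetical stand-ins for the weight profiles studied in the paper.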

Convergence and Implicit Regularization Properties of Gradient Descent for Deep Residual Networks

Apr 14, 2022

Rama Cont, Alain Rossier, Renyuan Xu

Abstract: We prove linear convergence of gradient descent to a global minimum for the training of deep residual networks with constant layer width and smooth activation function. We further show that the trained weights, as a function of the layer index, admit a scaling limit which is Hölder continuous as the depth of the network tends to infinity. The proofs are based on non-asymptotic estimates of the loss function and of norms of the network weights along the gradient descent path. We illustrate the relevance of our theoretical results to practical settings using detailed numerical experiments on supervised learning problems.

Scaling Properties of Deep Residual Networks

Jun 10, 2021

Alain-Sam Cohen, Rama Cont, Alain Rossier, Renyuan Xu

Abstract: Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.

* Published at ICML 2021
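
One simple way to probe how weight magnitudes scale with depth, in the spirit of the numerical experiments the abstract describes, is a log-log regression of a weight norm against the number of layers. The sketch below is not the authors' procedure: the function name `fit_scaling_exponent`, the choice of RMS Frobenius norm, and the synthetic Gaussian weights standing in for trained weights are all assumptions made for illustration.

```python
import numpy as np

def fit_scaling_exponent(depths, weight_norms):
    """Estimate beta in norm ~ C * L**(-beta) by least squares on log-log data.

    Hypothetical helper: `depths` are network depths L, `weight_norms` is a
    summary norm of the layer weights measured at each depth.
    """
    slope, _intercept = np.polyfit(np.log(depths), np.log(weight_norms), 1)
    return -slope

# Synthetic stand-in for trained weights (an assumption, not real data):
# i.i.d. Gaussian d x d layers with standard deviation L**-0.5, i.e. the
# magnitude one would expect in a diffusive regime.
rng = np.random.default_rng(0)
d = 4
depths = [16, 32, 64, 128, 256]

rms_norms = []
for L in depths:
    weights = rng.normal(scale=L ** -0.5, size=(L, d, d))
    layer_norms = np.linalg.norm(weights, axis=(1, 2))  # Frobenius norm per layer
    rms_norms.append(np.sqrt(np.mean(layer_norms ** 2)))

beta_hat = fit_scaling_exponent(depths, rms_norms)
print(f"estimated scaling exponent: {beta_hat:.3f}")
```

An estimate near 1 would be consistent with the step size assumed in the neural ODE setting, while an estimate near 0.5 would be consistent with diffusive behaviour. With real trained weights the choice of norm matters, and a single exponent may not describe the data at all.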

Universal features of price formation in financial markets: perspectives from Deep Learning

Mar 19, 2018

Justin Sirignano, Rama Cont

Abstract: Using a large-scale Deep Learning approach applied to a high-frequency database containing billions of electronic market quotes and transactions for US equities, we uncover nonparametric evidence for the existence of a universal and stationary price formation mechanism relating the dynamics of supply and demand for a stock, as revealed through the order book, to subsequent variations in its market price. We assess the model by testing its out-of-sample predictions for the direction of price moves given the history of price and order flow, across a wide range of stocks and time periods. The universal price formation model is shown to exhibit a remarkably stable out-of-sample prediction accuracy across time, for a wide range of stocks from different sectors. Interestingly, these results also hold for stocks which are not part of the training sample, showing that the relations captured by the model are universal and not asset-specific. The universal model, trained on data from all stocks, outperforms, in terms of out-of-sample prediction accuracy, asset-specific linear and nonlinear models trained on time series of any given stock, showing that the universal nature of price formation weighs in favour of pooling together financial data from various stocks, rather than designing asset- or sector-specific models as commonly done. Standard data normalizations based on volatility, price level or average spread, or partitioning the training data into sectors or categories such as large/small tick stocks, do not improve training results. On the other hand, inclusion of price and order flow history over many past observations is shown to improve forecasting performance, showing evidence of path-dependence in price dynamics.
