Evelyn Herberg

Sensitivity-Based Layer Insertion for Residual and Feedforward Neural Networks

Nov 27, 2023

Evelyn Herberg, Roland Herzog, Frederik Köhne, Leonie Kreis, Anton Schiela

Abstract: The training of neural networks requires tedious and often manual tuning of the network architecture. We propose a systematic method to insert new layers during the training process, which eliminates the need to choose a fixed network size before training. Our approach borrows techniques from constrained optimization and is based on first-order sensitivity information of the objective with respect to the virtual parameters that additional layers, if inserted, would offer. We consider fully connected feedforward networks with selected activation functions as well as residual neural networks. In numerical experiments, the proposed sensitivity-based layer insertion technique exhibits improved training decay compared to not inserting the layer. Furthermore, the computational effort is reduced in comparison to inserting the layer from the beginning. The code is available at https://github.com/LeonieKreis/layer_insertion_sensitivity_based.
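
The idea of a sensitivity with respect to "virtual parameters" can be illustrated with a toy example. The sketch below is not the authors' implementation (see their repository for that). It assumes a scalar residual network with tanh activations, a hypothetical virtual layer `x + virtual_w * tanh(x)` that acts as the identity when `virtual_w` is zero, central finite differences in place of backpropagation, and a "largest magnitude wins" selection rule. All function names, the data, and the weights are made up for illustration.

```python
import math

def forward(x, layer_weights, virtual_pos=None, virtual_w=0.0):
    """Scalar residual network; optionally applies one virtual layer before layer `virtual_pos`."""
    for k, w in enumerate(layer_weights):
        if k == virtual_pos:
            # Hypothetical virtual layer: exactly the identity when virtual_w == 0.
            x = x + virtual_w * math.tanh(x)
        x = x + w * math.tanh(x)
    return x

def loss(data, layer_weights, virtual_pos=None, virtual_w=0.0):
    """Mean squared error over (input, target) pairs."""
    errors = [(forward(x, layer_weights, virtual_pos, virtual_w) - y) ** 2 for x, y in data]
    return sum(errors) / len(errors)

def sensitivities(data, layer_weights, eps=1e-6):
    """Derivative of the loss w.r.t. the virtual weight at zero, for each candidate position."""
    result = []
    for pos in range(len(layer_weights)):
        plus = loss(data, layer_weights, pos, eps)
        minus = loss(data, layer_weights, pos, -eps)
        result.append((plus - minus) / (2 * eps))  # central finite difference
    return result

# Illustrative data and weights (assumptions, not taken from the paper).
data = [(0.5, 1.0), (1.0, 2.0), (-0.5, -1.0)]
layer_weights = [0.3, 0.1]

sens = sensitivities(data, layer_weights)
# Assumed selection rule: insert where the loss reacts most strongly.
best = max(range(len(sens)), key=lambda k: abs(sens[k]))
print("sensitivities:", sens, "-> candidate position:", best)
```

Because the virtual layer is the identity at `virtual_w = 0`, evaluating the sensitivities does not change the current network output; only the derivative information is used to decide where an insertion looks promising.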

Time Regularization in Optimal Time Variable Learning

Jun 28, 2023

Evelyn Herberg, Roland Herzog, Frederik Köhne

Abstract: Recently, optimal time variable learning in deep neural networks (DNNs) was introduced in arXiv:2204.08528. In this manuscript we extend the concept by introducing a regularization term that directly relates to the time horizon in discrete dynamical systems. Furthermore, we propose an adaptive pruning approach for Residual Neural Networks (ResNets), which reduces network complexity without compromising expressiveness, while simultaneously decreasing training time. The results are illustrated by applying the proposed concepts to classification tasks on the well-known MNIST and Fashion MNIST data sets. Our PyTorch code is available at https://github.com/frederikkoehne/time_variable_learning.

Lecture Notes: Neural Network Architectures

Apr 18, 2023

Evelyn Herberg

Abstract: These lecture notes provide an overview of Neural Network architectures from a mathematical point of view. In particular, Machine Learning with Neural Networks is viewed as an optimization problem. The notes cover an introduction to Neural Networks and the following architectures: Feedforward Neural Network, Convolutional Neural Network, ResNet, and Recurrent Neural Network.

* added more references

An Optimal Time Variable Learning Framework for Deep Neural Networks

Apr 18, 2022

Harbir Antil, Hugo Díaz, Evelyn Herberg

Abstract: Feature propagation in Deep Neural Networks (DNNs) can be associated with nonlinear discrete dynamical systems. The novelty of this paper lies in letting the discretization parameter (time step size) vary from layer to layer and learning it within an optimization framework. The proposed framework can be applied to any of the existing networks such as ResNet, DenseNet or Fractional-DNN. This framework is shown to help overcome the vanishing and exploding gradient issues. Stability of some of the existing continuous DNNs such as Fractional-DNN is also studied. The proposed approach is applied to an ill-posed 3D-Maxwell's equation.
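
The link between residual layers and a discrete dynamical system with a layer-dependent step size can be sketched as a forward-Euler recursion, x_{k+1} = x_k + tau_k * sigma(w_k x_k + b_k). The snippet below is a minimal illustration under assumptions that are not from the paper: scalar features, a tanh activation, and hand-picked values for the weights and step sizes `taus`. It shows only the forward propagation; in the framework described above the step sizes would be learned together with the weights.

```python
import math

def resnet_forward(x, weights, biases, taus):
    """Propagate a scalar feature through residual layers with per-layer step sizes."""
    for w, b, tau in zip(weights, biases, taus):
        # Forward-Euler step with layer-dependent step size tau.
        x = x + tau * math.tanh(w * x + b)
    return x

# Illustrative parameters (assumptions, not taken from the paper).
weights = [0.5, -0.3, 0.8]
biases = [0.0, 0.1, -0.2]

uniform = resnet_forward(1.0, weights, biases, taus=[1.0, 1.0, 1.0])
variable = resnet_forward(1.0, weights, biases, taus=[0.5, 1.5, 0.0])
print("uniform steps:", uniform)
print("variable steps:", variable)
```

A step size of zero turns a layer into the identity map, which is one way to see why learned step sizes carry information about how much each layer contributes.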
