Title: Predicting the success of Gradient Descent for a particular Dataset-Architecture-Initialization (DAI)

Nov 25, 2021

Umangi Jain, Harish G. Ramaswamy

Abstract: Despite the massive success of deep neural networks, training them successfully still relies largely on experimentally choosing an architecture, hyper-parameters, initialization, and training mechanism. In this work, we focus on determining whether the standard gradient descent method will succeed in training a deep neural network on a specified dataset, architecture, and initialization (DAI) combination. Through extensive systematic experiments, we show that the evolution of the singular values of the matrix obtained from the hidden layers of a DNN can aid in determining the success of gradient descent in training a DAI, even in the absence of validation labels in the supervised learning paradigm. This phenomenon can facilitate early give-up: stopping, early in the training process, the training of neural networks that are predicted not to generalize well. Our experiments across multiple datasets, architectures, and initializations reveal that the proposed scores predict the success of a DAI more accurately than simply relying on the validation accuracy at earlier epochs.

* 10 pages, 9 figures
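
As a rough illustration of the idea in the abstract, the sketch below records the singular values of a hidden-layer output matrix at successive epochs without using any labels. It is a minimal sketch under stated assumptions, not the authors' method: the abstract does not define the proposed scores, so `top_k_energy` is a hypothetical summary statistic, and the random weights stand in for a real network's layer at each epoch.

```python
import numpy as np

def hidden_singular_values(activations):
    """Return the singular values (descending) of an
    (n_samples, n_units) hidden-layer output matrix."""
    matrix = np.asarray(activations, dtype=float)
    return np.linalg.svd(matrix, compute_uv=False)

def top_k_energy(singular_values, k=1):
    """Hypothetical summary (not the paper's score): fraction of the
    squared spectrum carried by the k largest singular values."""
    squared = np.asarray(singular_values, dtype=float) ** 2
    total = squared.sum()
    return float(squared[:k].sum() / total) if total > 0 else 0.0

# Toy stand-in for hidden-layer outputs recorded at successive epochs.
rng = np.random.default_rng(0)
inputs = rng.normal(size=(64, 10))          # unlabeled inputs
history = []
for epoch in range(3):
    weights = rng.normal(size=(10, 8))      # placeholder for the layer weights at this epoch
    hidden = np.maximum(inputs @ weights, 0.0)  # ReLU hidden-layer output
    spectrum = hidden_singular_values(hidden)
    history.append(top_k_energy(spectrum, k=1))
```

In a real experiment, `hidden` would come from a forward pass of the network being trained, and `history` would be inspected for trends over epochs. Which statistic of the spectrum is predictive is exactly what the paper studies, so the choice here is only a placeholder.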
