Abstract:Learning rate is a crucial parameter in training of neural networks. A properly tuned learning rate leads to faster training and higher test accuracy. In this paper, we propose a Lipschitz bandit-driven approach for tuning the learning rate of neural networks. The proposed approach is compared with the popular HyperOpt technique used extensively for hyperparameter optimization and the recently developed bandit-based algorithm BLiE. The results for multiple neural network architectures indicate that our method finds a better learning rate using a) fewer evaluations and b) lesser number of epochs per evaluation, when compared to both HyperOpt and BLiE. Thus, the proposed approach enables more efficient training of neural networks, leading to lower training time and lesser computational cost.
Abstract:Motivated by multiple applications in social networks, nervous systems, and financial risk analysis, we consider the problem of learning the underlying (directed) influence graph or causal graph of a high-dimensional multivariate discrete-time Markov process with memory. At any discrete time instant, each observed variable of the multivariate process is a binary string of random length, which is parameterized by an unobservable or hidden [0,1]-valued scalar. The hidden scalars corresponding to the variables evolve according to discrete-time linear stochastic dynamics dictated by the underlying influence graph whose nodes are the variables. We extend an existing algorithm for learning i.i.d. graphical models to this Markovian setting with memory and prove that it can learn the influence graph based on the binary observations using logarithmic (in number of variables or nodes) samples when the degree of the influence graph is bounded. The crucial analytical contribution of this work is the derivation of the sample complexity result by upper and lower bounding the rate of convergence of the observed Markov process with memory to its stationary distribution in terms of the parameters of the influence graph.