Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kouhei Nishida

Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

May 21, 2019

Youhei Akimoto, Shinichi Shirakawa, Nozomu Yoshinari, Kento Uchida, Shota Saito, Kouhei Nishida

Figure 1 for Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

Figure 2 for Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

Figure 3 for Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

Figure 4 for Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search

Abstract:High sensitivity of neural architecture search (NAS) methods against their input such as step-size (i.e., learning rate) and search space prevents practitioners from applying them out-of-the-box to their own problems, albeit its purpose is to automate a part of tuning process. Aiming at a fast, robust, and widely-applicable NAS, we develop a generic optimization framework for NAS. We turn a coupled optimization of connection weights and neural architecture into a differentiable optimization by means of stochastic relaxation. It accepts arbitrary search space (widely-applicable) and enables to employ a gradient-based simultaneous optimization of weights and architecture (fast). We propose a stochastic natural gradient method with an adaptive step-size mechanism built upon our theoretical investigation (robust). Despite its simplicity and no problem-dependent parameter tuning, our method exhibited near state-of-the-art performances with low computational budgets both on image classification and inpainting tasks.

* Accepted to ICML 2019. Code is available at https://github.com/shirakawas/ASNG-NAS

Via

Access Paper or Ask Questions

Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network

Sep 18, 2018

Kouhei Nishida, Hernan Aguirre, Shota Saito, Shinichi Shirakawa, Youhei Akimoto

Figure 1 for Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network

Figure 2 for Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network

Figure 3 for Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network

Figure 4 for Parameterless Stochastic Natural Gradient Method for Discrete Optimization and its Application to Hyper-Parameter Optimization for Neural Network

Abstract:Black box discrete optimization (BBDO) appears in wide range of engineering tasks. Evolutionary or other BBDO approaches have been applied, aiming at automating necessary tuning of system parameters, such as hyper parameter tuning of machine learning based systems when being installed for a specific task. However, automation is often jeopardized by the need of strategy parameter tuning for BBDO algorithms. An expert with the domain knowledge must undergo time-consuming strategy parameter tuning. This paper proposes a parameterless BBDO algorithm based on information geometric optimization, a recent framework for black box optimization using stochastic natural gradient. Inspired by some theoretical implications, we develop an adaptation mechanism for strategy parameters of the stochastic natural gradient method for discrete search domains. The proposed algorithm is evaluated on commonly used test problems. It is further extended to two examples of simultaneous optimization of the hyper parameters and the connection weights of deep learning models, leading to a faster optimization than the existing approaches without any effort of parameter tuning.

Via

Access Paper or Ask Questions