Hideaki Iiduka

Iteration and Stochastic First-order Oracle Complexities of Stochastic Gradient Descent using Constant and Decaying Learning Rates

Feb 23, 2024
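
The two learning-rate regimes named in the title are the standard ones: a fixed step size versus a step size that shrinks over iterations. A minimal sketch of both on a toy noisy quadratic; the schedule forms and constants here are illustrative defaults, not the paper's analyzed settings.

```python
import random

def sgd(grad, x0, lr_fn, steps=1000, batch=8):
    """Run SGD with a per-step learning rate lr_fn(k)."""
    x = x0
    for k in range(steps):
        g = sum(grad(x) for _ in range(batch)) / batch  # mini-batch gradient
        x -= lr_fn(k) * g
    return x

# Toy objective f(x) = x^2 with additive gradient noise.
noisy_grad = lambda x: 2 * x + random.gauss(0, 0.1)

x_const = sgd(noisy_grad, 5.0, lambda k: 0.1)                   # constant rate
x_decay = sgd(noisy_grad, 5.0, lambda k: 0.1 / (k + 1) ** 0.5)  # ~1/sqrt(k) decay
print(x_const, x_decay)
```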

Role of Momentum in Smoothing Objective Function in Implicit Graduated Optimization

Feb 04, 2024
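
The momentum in question is the standard heavy-ball buffer, which the title interprets as implicitly smoothing the objective by averaging past noisy gradients. For reference, the textbook update only; nothing here reproduces the paper's graduated-optimization argument.

```python
def sgd_momentum(grad, x, lr=0.01, beta=0.9, steps=100):
    """Standard heavy-ball SGD: the buffer m averages past noisy
    gradients, the mechanism the paper reads as implicit smoothing."""
    m = 0.0
    for _ in range(steps):
        m = beta * m + grad(x)   # accumulate gradient history
        x -= lr * m              # step along the averaged direction
    return x
```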

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling

Nov 29, 2023
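
Graduated optimization minimizes a sequence of progressively less-smoothed versions of a nonconvex objective. A generic sketch using Monte Carlo Gaussian smoothing with a geometrically decaying noise level; the schedule below is a placeholder, not the optimal scheduling the paper derives.

```python
import random

def smoothed_grad(grad, x, sigma, samples=16):
    """Monte Carlo gradient of the Gaussian-smoothed objective
    f_sigma(x) = E[f(x + sigma * u)], u ~ N(0, 1)."""
    return sum(grad(x + sigma * random.gauss(0, 1)) for _ in range(samples)) / samples

def graduated(grad, x, sigma0=1.0, decay=0.5, rounds=5, lr=0.05, steps=200):
    sigma = sigma0
    for _ in range(rounds):          # one round per smoothing level
        for _ in range(steps):
            x -= lr * smoothed_grad(grad, x, sigma)
        sigma *= decay               # sharpen the objective each round
    return x
```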

Relationship between Batch Size and Number of Steps Needed for Nonconvex Optimization of Stochastic Gradient Descent using Armijo Line Search

Aug 03, 2023
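
SGD with Armijo line search backtracks the step size until a sufficient-decrease condition holds on the current mini-batch loss. A textbook version of the rule in the scalar case; the constants c and the backtracking factor are conventional choices, not taken from the paper.

```python
def armijo_step(f, grad_f, x, eta0=1.0, c=1e-4, rho=0.5, max_backtracks=30):
    """Return x after one step whose size eta satisfies the Armijo condition
    f(x - eta * g) <= f(x) - c * eta * ||g||^2 on the current batch."""
    g = grad_f(x)
    fx, eta = f(x), eta0
    for _ in range(max_backtracks):
        if f(x - eta * g) <= fx - c * eta * g * g:  # scalar case: ||g||^2 = g*g
            break
        eta *= rho  # shrink the step and retry
    return x - eta * g
```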

Critical Batch Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One

Aug 21, 2022
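
SFO complexity counts stochastic gradient evaluations: if K(b) steps are needed at batch size b, the total cost is b * K(b), and a rational K(b) yields an interior minimizer, the critical batch size. A worked example with a purely hypothetical rational form K(b) = A*b/(b - B), constants invented for illustration (see also the Aug 2021 entry below on K(b) being rational in b).

```python
A, B = 1000.0, 16.0                      # hypothetical constants

K = lambda b: A * b / (b - B)            # steps needed at batch size b
sfo = lambda b: b * K(b)                 # total gradient evaluations

# d/db [A*b^2/(b - B)] = 0  =>  b* = 2B, the critical batch size.
candidates = range(int(B) + 1, 512)
b_star = min(candidates, key=sfo)
print(b_star, sfo(b_star))               # -> 32 64000.0, matching b* = 2B
```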

Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness

Jun 27, 2022
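
The hyperparameters in question are Adam's exponential-decay factors β1 and β2. For reference, the textbook Adam update with both factors set near one as the title suggests; this is the standard form, not the paper's analysis without Lipschitz smoothness.

```python
import math

def adam(grad, x, lr=1e-3, beta1=0.999, beta2=0.999, eps=1e-8, steps=1000):
    """Textbook Adam with beta1, beta2 close to one."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x
```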

Conjugate Gradient Method for Generative Adversarial Networks

Mar 28, 2022
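
Conjugate gradient methods replace the raw gradient with a search direction that mixes in the previous direction. A sketch of the Fletcher-Reeves variant as a plain local optimizer; the coupling to the GAN min-max problem studied in the paper is omitted.

```python
import numpy as np

def cg_descent(grad, x, lr=0.01, steps=100):
    """Fletcher-Reeves conjugate gradient direction with a fixed step size."""
    g = grad(x)
    d = -g                                   # first direction: steepest descent
    for _ in range(steps):
        x = x + lr * d
        g_new = grad(x)
        gamma = (g_new @ g_new) / (g @ g)    # Fletcher-Reeves coefficient
        d = -g_new + gamma * d               # conjugate direction update
        g = g_new
    return x
```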

Using Constant Learning Rate of Two Time-Scale Update Rule for Training Generative Adversarial Networks

Jan 28, 2022
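
The two time-scale update rule (TTUR) trains the discriminator and generator with different learning rates; this paper studies keeping both rates constant. A minimal PyTorch-style skeleton of the alternating loop, assuming D outputs probabilities in (0, 1); the models, losses, and rate values are placeholders.

```python
import torch

def ttur_step(D, G, d_opt, g_opt, real, z):
    """One TTUR iteration: update D, then G, each with its own
    constant learning rate baked into its optimizer."""
    d_opt.zero_grad()
    d_loss = -(torch.log(D(real)).mean() + torch.log(1 - D(G(z).detach())).mean())
    d_loss.backward()
    d_opt.step()

    g_opt.zero_grad()
    g_loss = -torch.log(D(G(z))).mean()
    g_loss.backward()
    g_opt.step()

# Constant, unequal rates per time scale (values are illustrative only):
# d_opt = torch.optim.Adam(D.parameters(), lr=4e-4)
# g_opt = torch.optim.Adam(G.parameters(), lr=1e-4)
```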

Minimization of Stochastic First-order Oracle Complexity of Adaptive Methods for Nonconvex Optimization

Dec 16, 2021

The Number of Steps Needed for Nonconvex Optimization of a Deep Learning Optimizer is a Rational Function of Batch Size

Aug 26, 2021