Hideaki Iiduka

Convergence of Sharpness-Aware Minimization Algorithms using Increasing Batch Size and Decaying Learning Rate

Sep 16, 2024
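
The paper is not summarized on this page, so as a rough illustration of the sharpness-aware minimization (SAM) update the title refers to, here is a minimal sketch on a toy quadratic. The radius rho, the step size, and the objective are illustrative assumptions, not values from the paper (which additionally varies the batch size and learning rate over time).

```python
import numpy as np

def grad(w):
    # Gradient of the toy objective f(w) = 0.5 * ||w||^2.
    return w

def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    # Ascent to the approximate worst-case point in an L2 ball of radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step using the gradient taken at the perturbed point.
    return w - lr * grad(w + eps)

w = np.array([1.0, -2.0])
for _ in range(200):
    w = sam_step(w)
print(w)  # close to the minimizer at the origin
```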

Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent

Sep 13, 2024
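
As a quick illustration of the kind of schedule the title describes, here is a sketch that grows both the batch size and the learning rate at fixed intervals. The growth factors and interval are assumed for demonstration and are not the paper's settings.

```python
def schedule(epoch, base_batch=32, base_lr=0.01,
             batch_growth=2, lr_growth=1.5, interval=10):
    k = epoch // interval  # number of completed growth phases
    return base_batch * batch_growth ** k, base_lr * lr_growth ** k

for epoch in (0, 10, 20, 30):
    print(epoch, schedule(epoch))  # (batch size, learning rate) per phase
```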

Iteration and Stochastic First-order Oracle Complexities of Stochastic Gradient Descent using Constant and Decaying Learning Rates

Feb 23, 2024
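
For a concrete reading of the title: the stochastic first-order oracle (SFO) complexity counts gradient evaluations, i.e. steps times batch size. Below is a small sketch comparing a constant and a decaying learning rate on a noisy toy problem; all constants are assumed for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
b = 8  # batch size

def sfo_complexity(lr_fn, tol=1e-3, max_steps=100000):
    w = np.array([5.0])
    for t in range(max_steps):
        g = w + rng.standard_normal(b).mean()  # mini-batch gradient of 0.5 * w^2
        w = w - lr_fn(t) * g
        if 0.5 * w.dot(w) < tol:
            return (t + 1) * b  # SFO complexity = steps x batch size
    return max_steps * b

print("constant lr:", sfo_complexity(lambda t: 0.05))
print("decaying lr:", sfo_complexity(lambda t: 0.5 / np.sqrt(t + 1)))
```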

Role of Momentum in Smoothing Objective Function in Implicit Graduated Optimization

Feb 04, 2024
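
As a reference point for the update analyzed here, a minimal sketch of SGD with heavy-ball momentum on a toy quadratic; the coefficients and objective are illustrative assumptions.

```python
import numpy as np

def momentum_sgd(grad, w, lr=0.1, beta=0.9, steps=100):
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + grad(w)  # exponential average of past gradients
        w = w - lr * v
    return w

# Gradient of f(w) = 0.5 * ||w||^2 is w itself.
print(momentum_sgd(lambda w: w, np.array([1.0, -1.0])))  # near the origin
```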

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling

Nov 29, 2023
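
To illustrate the graduated-optimization idea in the title, here is a sketch that minimizes Gaussian smoothings f_sigma(w) = E[f(w + sigma * u)] of a toy nonconvex function under a decreasing noise schedule, estimating smoothed gradients by two-point Monte Carlo. The schedule and constants are assumptions, not the paper's optimal scheduling.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(w):
    # Toy nonconvex objective: a quadratic bowl plus an oscillatory term.
    return 0.5 * (w @ w) + np.sin(5.0 * w[0])

def smoothed_grad(w, sigma, n=256):
    # Two-point Monte Carlo estimate of the gradient of the Gaussian
    # smoothing f_sigma(w), with u ~ N(0, I).
    u = rng.standard_normal((n, w.size))
    diffs = np.array([f(w + sigma * ui) - f(w - sigma * ui) for ui in u])
    return (diffs[:, None] * u).mean(axis=0) / (2.0 * sigma)

w = np.array([2.0, -2.0])
for sigma in (1.0, 0.5, 0.25, 0.1):  # decreasing noise schedule
    for _ in range(200):
        w = w - 0.05 * smoothed_grad(w, sigma)
print(w, f(w))
```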

Relationship between Batch Size and Number of Steps Needed for Nonconvex Optimization of Stochastic Gradient Descent using Armijo Line Search

Aug 03, 2023
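
As an illustration of the line-search rule in the title: the step size is shrunk until the Armijo sufficient-decrease condition f(w - lr * g) <= f(w) - c * lr * ||g||^2 holds. The toy quartic objective, c, and the backtracking factor below are illustrative assumptions.

```python
import numpy as np

def armijo_step(w, loss, grad, lr0=1.0, c=1e-4, beta=0.5, max_backtracks=30):
    g = grad(w)
    f0 = loss(w)
    lr = lr0
    # Shrink the step until the Armijo sufficient-decrease condition holds.
    for _ in range(max_backtracks):
        if loss(w - lr * g) <= f0 - c * lr * g.dot(g):
            break
        lr *= beta
    return w - lr * g

loss = lambda w: 0.25 * (w.dot(w)) ** 2   # toy quartic f(w) = 0.25 * ||w||^4
grad = lambda w: w.dot(w) * w
w = np.array([3.0, -1.0])
for _ in range(20):
    w = armijo_step(w, loss, grad)
print(w)  # near the origin
```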

Critical Batch Size Minimizes Stochastic First-Order Oracle Complexity of Deep Learning Optimizer using Hyperparameters Close to One

Aug 21, 2022
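
For context on the quantity in the title: if K(b) steps are needed at batch size b, the stochastic first-order oracle complexity is SFO(b) = b * K(b), and the critical batch size is the b minimizing this product; the specific form of K(b) is what the paper derives.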

Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness

Jun 27, 2022
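
As a pointer to the setting in the title, a minimal sketch of one Adam step with both exponential-decay hyperparameters set close to one (beta1 = beta2 = 0.999); the learning rate and the toy gradient are illustrative assumptions.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.999, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * g      # first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)         # bias corrections matter most when
    v_hat = v / (1 - beta2 ** t)         # beta1 and beta2 are close to one
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w, m, v = adam_step(w, w.copy(), m, v, t=1)  # gradient of 0.5 * ||w||^2 is w
print(w)  # each coordinate moves by ~lr toward the origin
```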

Conjugate Gradient Method for Generative Adversarial Networks

Mar 28, 2022
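
To illustrate the conjugate gradient direction underlying the title, a sketch of nonlinear CG with the Fletcher-Reeves coefficient, applied to a toy quadratic rather than a GAN objective; the fixed step size is an illustrative simplification, and Fletcher-Reeves is only one of several coefficient choices such methods can use.

```python
import numpy as np

def fletcher_reeves_cg(grad, w, lr=0.1, steps=50):
    g = grad(w)
    d = -g  # initial search direction is steepest descent
    for _ in range(steps):
        w = w + lr * d
        g_new = grad(w)
        beta = g_new.dot(g_new) / (g.dot(g) + 1e-12)  # Fletcher-Reeves coefficient
        d = -g_new + beta * d  # conjugate direction update
        g = g_new
    return w

print(fletcher_reeves_cg(lambda w: w, np.array([1.0, 2.0])))  # near the origin
```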

Using Constant Learning Rate of Two Time-Scale Update Rule for Training Generative Adversarial Networks

Jan 28, 2022
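
As an illustration of the two time-scale update rule (TTUR) named in the title, a sketch on a toy bilinear min-max problem with two distinct constant learning rates. The rates and problem are assumptions, and the sketch shows only the update structure, not convergence behavior.

```python
# Toy bilinear min-max problem: min over x, max over y of f(x, y) = x * y.
lr_d, lr_g = 4e-4, 1e-4  # fast (discriminator-like) and slow (generator-like) rates

x, y = 1.0, 1.0
for _ in range(1000):
    y = y + lr_d * x  # ascent step on the fast time scale
    x = x - lr_g * y  # descent step on the slow time scale
print(x, y)
```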