Kwangjun Ahn

General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization
Nov 11, 2024

Learning to Achieve Goals with Belief State Transformers
Oct 30, 2024

Improved Sample Complexity of Imitation Learning for Barrier Model Predictive Control
Oct 01, 2024

Adam with model exponential moving average is effective for nonconvex optimization
May 28, 2024

Does SGD really happen in tiny subspaces?
May 25, 2024

Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
Feb 02, 2024

Linear attention is (maybe) all you need (to understand transformer optimization)
Oct 02, 2023

A Unified Approach to Controlling Implicit Regularization via Mirror Descent
Jun 24, 2023

Smooth Model Predictive Control with Applications to Statistical Learning
Jun 02, 2023

Transformers learn to implement preconditioned gradient descent for in-context learning
Jun 01, 2023