Yushun Zhang

MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning

Jul 31, 2024

Adam-mini: Use Fewer Learning Rates To Gain More

Jun 26, 2024

Why Transformers Need Adam: A Hessian Perspective

Feb 26, 2024

Communication Efficiency Optimization of Federated Learning for Computing and Network Convergence of 6G Networks

Nov 28, 2023

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models

Oct 17, 2023

Uncertainty and Explainable Analysis of Machine Learning Model for Reconstruction of Sonic Slowness Logs

Aug 24, 2023

Adam Can Converge Without Any Modification on Update Rules

Aug 23, 2022

Provable Adaptivity in Adam

Aug 21, 2022