Daniil Merkulov

Quantization of Large Language Models with an Overdetermined Basis

Apr 15, 2024
Via arXiv

NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizers

Sep 29, 2022
Via arXiv

Memory-Efficient Backpropagation through Large Linear Layers

Feb 02, 2022
Via arXiv

Fast Line Search for Multi-Task Learning

Oct 02, 2021
Via arXiv

Follow the bisector: a simple method for multi-objective optimization

Jul 14, 2020
Via arXiv

Stochastic gradient algorithms from ODE splitting perspective

Apr 19, 2020
Via arXiv

Empirical study of extreme overfitting points of neural networks

Jul 03, 2019
Via arXiv