Daniel Bershatsky

LoTR: Low Tensor Rank Weight Adaptation

Feb 05, 2024

NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizers

Sep 29, 2022

Survey on Large Scale Neural Network Training

Feb 21, 2022

Memory-Efficient Backpropagation through Large Linear Layers

Feb 02, 2022

Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

Feb 02, 2022