
Yifei Cheng

Communication-Efficient Distributed Learning with Local Immediate Error Compensation

Feb 19, 2024

DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training

Feb 28, 2022

STL-SGD: Speeding Up Local SGD with Stagewise Communication Period

Jun 11, 2020

Variance Reduced Local SGD with Lower Communication Complexity

Dec 30, 2019

Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent

Jun 28, 2019