Shengwei Li

Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent

Aug 18, 2023

Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models

Jun 21, 2022

EmbRace: Accelerating Sparse Communication for Distributed Training of NLP Neural Networks

Oct 18, 2021