Chuanxiong Guo

ByteDance

dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training

May 18, 2022
Via arXiv

Aryl: An Elastic Cluster Scheduler for Deep Learning

Feb 16, 2022
Via arXiv

Prediction of GPU Failures Under Deep Learning Workloads

Jan 27, 2022
Via arXiv

BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing

Dec 16, 2021
Via arXiv

Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem

Sep 18, 2021
Via arXiv

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly

May 22, 2021
Via arXiv