Yibo Zhu

ByteDance

QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices

Jul 02, 2024

CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs

Nov 17, 2023

ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs

Oct 06, 2022

ByteComp: Revisiting Gradient Compression in Distributed Training

Jun 06, 2022

dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training

May 18, 2022

Aryl: An Elastic Cluster Scheduler for Deep Learning

Feb 16, 2022

BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing

Dec 16, 2021

Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance

Oct 25, 2021

Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem

Sep 18, 2021

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly

May 22, 2021