Title: Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision

Jun 01, 2022

Wei Gao, Qinghao Hu, Zhisheng Ye, Peng Sun, Xiaolin Wang, Yingwei Luo, Tianwei Zhang, Yonggang Wen

Abstract: Deep learning (DL) has flourished in a wide variety of fields. Developing a DL model is a time-consuming and resource-intensive procedure, so dedicated GPU accelerators are commonly pooled into GPU datacenters. An efficient scheduler design for such a GPU datacenter is crucial for reducing operational cost and improving resource utilization. However, traditional approaches designed for big data or high-performance computing workloads cannot help DL workloads fully utilize GPU resources. Recently, a substantial number of schedulers tailored to DL workloads in GPU datacenters have been proposed. This paper surveys existing research efforts for both training and inference workloads. We primarily present how existing schedulers facilitate the respective workloads from the perspectives of scheduling objectives and resource consumption features. Finally, we discuss several promising future research directions. A more detailed summary with the surveyed papers and code links can be found at our project website: https://github.com/S-Lab-System-Group/Awesome-DL-Scheduling-Papers

* Submitted to ACM Computing Surveys
