Title: Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management

Sep 05, 2024

Yujie Wang, Shenhan Zhu, Fangcheng Fu, Xupeng Miao, Jie Zhang, Juan Zhu, Fan Hong, Yong Li, Bin Cui

Abstract: Recent foundation models can handle multiple machine learning (ML) tasks and multiple data modalities with a unified base model structure and several specialized model components. However, the development of such multi-task (MT) multi-modal (MM) models poses significant model management challenges for existing training systems. Because of the sophisticated model architecture and the heterogeneous workloads of different ML tasks and data modalities, training these models usually requires massive GPU resources and suffers from sub-optimal system efficiency. In this paper, we investigate how to achieve high-performance training of large-scale MT MM models through data heterogeneity-aware model management optimization. The key idea is to decompose the model execution into stages and address the joint optimization problem sequentially, covering both heterogeneity-aware workload parallelization and dependency-driven execution scheduling. Based on this, we build a prototype system and evaluate it on various large MT MM models. Experiments demonstrate the superior performance and efficiency of our system, with a speedup of up to 71% over state-of-the-art training systems.
