Ang Wang

SWIFT: A Scalable lightWeight Infrastructure for Fine-Tuning

Aug 13, 2024
Via arXiv

ROAM: memory-efficient large DNN training via optimized operator ordering and memory layout

Oct 30, 2023
Via arXiv

TAP: Accelerating Large-Scale DNN Training Through Tensor Automatic Parallelisation

Feb 01, 2023
Via arXiv

Revisiting and Advancing Chinese Natural Language Understanding with Accelerated Heterogeneous Knowledge Pre-training

Oct 12, 2022
Via arXiv

M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining

Oct 25, 2021
Via arXiv

Exploring Sparse Expert Models and Beyond

Jun 14, 2021
Via arXiv

M6: A Chinese Multimodal Pretrainer

Mar 02, 2021
Via arXiv

EasyTransfer -- A Simple and Scalable Deep Transfer Learning Platform for NLP Applications

Nov 23, 2020
Via arXiv