Donglin Zhuang

FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Jan 25, 2024
Via arXiv

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

Sep 19, 2023
Via arXiv

Randomness In Neural Network Training: Characterizing The Impact of Tooling

Jun 22, 2021
Via arXiv

An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning

Nov 20, 2020
Via arXiv