
Mengdi Wu

GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism

Jun 24, 2024

A Multi-Level Superoptimizer for Tensor Programs

May 09, 2024

FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning

Feb 29, 2024

Finding the Task-Optimal Low-Bit Sub-Distribution in Deep Neural Networks

Jan 13, 2022