Xianzhi Yu

FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers

Nov 21, 2024

FastAttention: Extend FlashAttention2 to NPUs and Low-resource GPUs

Oct 22, 2024

FlatQuant: Flatness Matters for LLM Quantization

Oct 12, 2024

Pinpointing the Memory Behaviors of DNN Training

Apr 01, 2021