Picture for Zhiying Wu

Zhiying Wu

INT-FlashAttention: Enabling Flash Attention for INT8 Quantization

Add code
Sep 26, 2024
Figure 1 for INT-FlashAttention: Enabling Flash Attention for INT8 Quantization
Figure 2 for INT-FlashAttention: Enabling Flash Attention for INT8 Quantization
Figure 3 for INT-FlashAttention: Enabling Flash Attention for INT8 Quantization
Figure 4 for INT-FlashAttention: Enabling Flash Attention for INT8 Quantization
Viaarxiv icon

Baichuan 2: Open Large-scale Language Models

Add code
Sep 20, 2023
Viaarxiv icon