Jeongin Bae

To FP8 and Back Again: Quantifying the Effects of Reducing Precision on LLM Training Stability

May 29, 2024

HyperCLOVA X Technical Report

Apr 13, 2024

No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization

Feb 28, 2024

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models

Oct 08, 2022