Sihwa Lee

Token-Scaled Logit Distillation for Ternary Weight Generative Language Models

Aug 13, 2023

Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders

Nov 20, 2022

NN-LUT: Neural Approximation of Non-Linear Operations for Efficient Transformer Inference

Dec 03, 2021