FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design

Add code
Jan 25, 2024

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: