Michael Truong-Le

Communication Compression for Tensor Parallel LLM Inference

Nov 14, 2024
Via arXiv