Longwei Zou

InstCache: A Predictive Cache for LLM Serving

Nov 21, 2024
Via arXiv

CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers

Apr 10, 2024
Via arXiv

A Multi-Level Framework for Accelerating Training Transformer Models

Apr 07, 2024
Via arXiv