Jiangang Kong

CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers

Apr 10, 2024
Via arXiv