Picture for Junping Zhao

Junping Zhao

LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management

Add code
Oct 01, 2024
Viaarxiv icon

vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving

Add code
Jul 22, 2024
Figure 1 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Figure 2 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Figure 3 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Figure 4 for vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
Viaarxiv icon