Liwei Guo

The CAP Principle for LLM Serving

May 18, 2024
EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models

Aug 28, 2023
Efficient NLP Inference at the Edge via Elastic Pipelining

Jul 12, 2022