Qinya Li

AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference

Jan 04, 2025

Edge-Cloud Routing for Text-to-Image Model with Token-Level Multi-Metric Prediction

Nov 21, 2024

One-Time Model Adaptation to Heterogeneous Clients: An Intra-Client and Inter-Image Attention Design

Nov 11, 2022