
Jiayi Yao

LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts

Nov 21, 2024

Optimal Transmit Signal Design for Multi-Target MIMO Sensing Exploiting Prior Information

Jun 21, 2024

CacheBlend: Fast Large Language Model Serving with Cached Knowledge Fusion

May 26, 2024

White-box Compiler Fuzzing Empowered by Large Language Models

Oct 24, 2023

CacheGen: Fast Context Loading for Language Model Applications

Oct 11, 2023

A pruning method based on the dissimilarity of angle among channels and filters

Oct 29, 2022

Neural Network Panning: Screening the Optimal Sparse Network Before Training

Sep 27, 2022