Junchen Jiang

RAGServe: Fast Quality-Aware RAG Systems with Configuration Adaptation

Dec 13, 2024

LLMSteer: Improving Long-Context LLM Inference by Steering Attention on Reused Contexts

Nov 21, 2024

DroidSpeak: Enhancing Cross-LLM Communication

Nov 05, 2024

SwiftQueue: Optimizing Low-Latency Applications with Swift Packet Queuing

Oct 08, 2024

CacheBlend: Fast Large Language Model Serving with Cached Knowledge Fusion

May 26, 2024

Large Language Model Adaptation for Networking

Feb 04, 2024

Chatterbox: Robust Transport for LLM Token Streaming under Unstable Network

Jan 23, 2024

CacheGen: Fast Context Loading for Language Model Applications

Oct 11, 2023

Automatic and Efficient Customization of Neural Networks for ML Applications

Oct 07, 2023

OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation

Oct 03, 2023