Chaofan Lin

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

Feb 06, 2025
Via arXiv

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

May 30, 2024
Via arXiv