Fuwei Yang

Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE

Feb 10, 2025
Via arXiv

FTP: A Fine-grained Token-wise Pruner for Large Language Models via Token Routing

Dec 16, 2024
Via arXiv