Jianyu Wei

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024, via arXiv

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024, via arXiv

AFPQ: Asymmetric Floating Point Quantization for LLMs

Nov 03, 2023, via arXiv

Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference

Aug 23, 2023, via arXiv