Ke Hong

A Survey on Efficient Inference for Large Language Models

Apr 22, 2024
Via arXiv

FlashDecoding++: Faster Large Language Model Inference on GPUs

Nov 10, 2023
Via arXiv

Ada3D: Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection

Jul 17, 2023
Via arXiv