Ruihang Lai

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

Jan 02, 2025
Via arXiv

WebLLM: A High-Performance In-Browser LLM Inference Engine

Dec 20, 2024
Via arXiv

XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models

Nov 22, 2024
Via arXiv

Emerging Platforms Meet Emerging LLMs: A Year-Long Journey of Top-Down Development

Apr 14, 2024
Via arXiv

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

Nov 01, 2023
Via arXiv

SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning

Jul 11, 2022
Via arXiv

TensorIR: An Abstraction for Automatic Tensorized Program Optimization

Jul 09, 2022
Via arXiv

Tensor Program Optimization with Probabilistic Programs

May 26, 2022
Via arXiv