Vinod Grover

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

Jan 02, 2025
Via arXiv

Scaling Deep Learning Training with MPMD Pipeline Parallelism

Dec 18, 2024
Via arXiv

Pattern Matching in AI Compilers and its Formalization (Extended Version)

Dec 18, 2024
Via arXiv