Vinod Grover

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving

Jan 02, 2025
Via arXiv

Scaling Deep Learning Training with MPMD Pipeline Parallelism

Dec 18, 2024
Via arXiv

Pattern Matching in AI Compilers and its Formalization (Extended Version)

Dec 18, 2024
Via arXiv