Matei Zaharia

HashAttention: Semantic Sparsity for Faster Inference

Dec 19, 2024
Via arXiv

MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs

Nov 18, 2024
Via arXiv

Drowning in Documents: Consequences of Scaling Reranker Inference

Nov 18, 2024
Via arXiv

Long Context RAG Performance of Large Language Models

Nov 05, 2024
Via arXiv

ElasticTok: Adaptive Tokenization for Image and Video

Oct 10, 2024
Via arXiv

Text2SQL is Not Enough: Unifying AI and Databases with TAG

Aug 27, 2024
Via arXiv

Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design

Jul 23, 2024
Via arXiv

LOTUS: Enabling Semantic Queries with LLMs Over Tables of Unstructured and Structured Data

Jul 16, 2024
Via arXiv

Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs

Jun 17, 2024
Via arXiv

Generating Probabilistic Scenario Programs from Natural Language

May 03, 2024
Via arXiv