Shivaram Venkataraman

LV-XAttn: Distributed Cross-Attention for Long Visual Inputs in Multimodal Large Language Models

Feb 04, 2025
Via arXiv

Scaling Inference-Efficient Language Models

Jan 30, 2025
Via arXiv

Incremental IVF Index Maintenance for Streaming Vector Search

Nov 01, 2024
Via arXiv

GraphSnapShot: Graph Machine Learning Acceleration with Fast Storage and Retrieval

Jun 25, 2024
Via arXiv

CHAI: Clustered Head Attention for Efficient LLM Inference

Mar 12, 2024
Via arXiv

Decoding Speculative Decoding

Feb 02, 2024
Via arXiv

PolyThrottle: Energy-efficient Neural Network Inference on Edge Devices

Oct 30, 2023
Via arXiv

Does compressing activations help model parallel training?

Jan 06, 2023
Via arXiv

BagPipe: Accelerating Deep Recommendation Model Training

Feb 24, 2022
Via arXiv

Marius++: Large-Scale Training of Graph Neural Networks on a Single Machine

Feb 04, 2022
Via arXiv