Vijay Gadepally

LLM Inference Serving: Survey of Recent Advances and Opportunities

Jul 17, 2024

Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference

Mar 19, 2024

Sustainable Supercomputing for AI: GPU Power Capping at HPC Scale

Feb 25, 2024

Lincoln AI Computing Survey (LAICS) Update

Oct 13, 2023

From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference

Oct 04, 2023

A Green(er) World for A.I.

Jan 27, 2023

Building Heterogeneous Cloud System for Machine Learning Inference

Oct 15, 2022

An Evaluation of Low Overhead Time Series Preprocessing Techniques for Downstream Machine Learning

Sep 12, 2022

RIBBON: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances

Jul 28, 2022

Developing a Series of AI Challenges for the United States Department of the Air Force

Jul 14, 2022