
Esha Choukse

DroidSpeak: Enhancing Cross-LLM Communication

Nov 05, 2024

Input-Dependent Power Usage in GPUs

Sep 26, 2024

Mnemosyne: Parallelization Strategies for Efficiently Serving Multi-Million Context Length LLM Inference Requests Without Approximations

Sep 25, 2024

DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency

Aug 01, 2024

Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference

Mar 29, 2024

POLCA: Power Oversubscription in LLM Cloud Providers

Aug 24, 2023

PruneTrain: Gradual Structured Pruning from Scratch for Faster Neural Network Training

Jan 26, 2019