Gennady Pekhimenko

Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization (Mar 24, 2025)

Seesaw: High-throughput LLM Inference via Model Re-sharding (Mar 09, 2025)

APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts (Jun 19, 2024)

Proteus: Preserving Model Confidentiality during Graph Optimizations (Apr 18, 2024)

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems (Feb 26, 2024)

The Synergy of Speculative Decoding and Batching in Serving Large Language Models (Oct 28, 2023)

Speeding up Fourier Neural Operators via Mixed Precision (Jul 27, 2023)

Tempo: Accelerating Transformer-Based Model Training through Memory Footprint Reduction (Oct 19, 2022)

Hidet: Task Mapping Programming Paradigm for Deep Learning Tensor Programs (Oct 18, 2022)

Optimizing Data Collection in Deep Reinforcement Learning (Jul 15, 2022)