Christina Giannoula

Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization

Mar 24, 2025
Via arXiv

Seesaw: High-throughput LLM Inference via Model Re-sharding

Mar 09, 2025
Via arXiv

PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System

Feb 21, 2025
Via arXiv

Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models

Aug 13, 2024
Via arXiv

Proteus: Preserving Model Confidentiality during Graph Optimizations

Apr 18, 2024
Via arXiv

Accelerating Graph Neural Networks on Real Processing-In-Memory Systems

Feb 26, 2024
Via arXiv

The Synergy of Speculative Decoding and Batching in Serving Large Language Models

Oct 28, 2023
Via arXiv