Thomas Labonte

Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures

Apr 16, 2025
Via arXiv