Alireza Nik

Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption

Feb 17, 2025

Alireza Nik, Michael A. Riegler, Pål Halvorsen

Abstract: Decoding strategies significantly influence the quality and diversity of the generated texts in large language models (LLMs), yet their impact on computational resource consumption, particularly GPU energy usage, is insufficiently studied. This paper investigates the relationship between text generation decoding methods and energy efficiency, focusing on the trade-off between generation quality and GPU energy consumption across diverse tasks and decoding configurations. By benchmarking multiple strategies across different text generation tasks, such as Translation, Code Summarization, and Math Problem Solving, we reveal how selecting appropriate decoding techniques with their tuned hyperparameters affects text quality and has measurable implications for resource utilization, emphasizing the need for balanced optimization. To the best of our knowledge, this study is among the first to explore decoding strategies in LLMs through the lens of energy consumption, offering actionable insights for designing resource-aware applications that maintain high-quality text generation.
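As a rough illustration of the kind of comparison the abstract describes, the sketch below turns timestamped GPU power readings into energy (joules) and energy per generated token for two decoding runs. It is not the authors' measurement pipeline: the function names, the sample values, and the token counts are all hypothetical, and the trapezoidal integration is simply one common way to estimate energy from sampled power. In practice the readings might come from a power-monitoring interface such as NVML, which is an assumption here and is not stated in the abstract.

```python
from typing import Sequence, Tuple

# Each sample is (timestamp in seconds, GPU power draw in watts).
Sample = Tuple[float, float]


def energy_joules(samples: Sequence[Sample]) -> float:
    """Estimate energy by trapezoidal integration of power over time."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def joules_per_token(samples: Sequence[Sample], n_tokens: int) -> float:
    """Normalize energy by the number of generated tokens."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return energy_joules(samples) / n_tokens


# Hypothetical readings for two decoding configurations (illustrative only,
# not data from the paper).
greedy_samples = [(0.0, 100.0), (1.0, 200.0), (2.0, 200.0)]
beam_samples = [(0.0, 100.0), (1.0, 300.0), (2.0, 300.0), (3.0, 300.0)]

for name, samples in [("greedy", greedy_samples), ("beam", beam_samples)]:
    print(name, energy_joules(samples), joules_per_token(samples, 50))
```

Normalizing by token count makes runs of different output lengths comparable; pairing this figure with a task-specific quality metric would give the quality versus energy trade-off the abstract refers to.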
