Abstract: Batteryless systems frequently face power failures and require extra runtime buffers to maintain inference progress, leaving only limited memory for storing ultra-tiny deep neural networks (DNNs). Moreover, making these models responsive to stochastic energy harvesting dynamics during inference requires balancing inference accuracy, latency, and energy overhead. Recent compression works mostly focus on time and memory but often ignore energy dynamics or significantly degrade the accuracy of pre-trained DNNs. Existing energy-adaptive inference works modify the architecture of pre-trained models and incur significant memory overhead. Thus, energy-adaptive and accurate inference of pre-trained DNNs on batteryless devices with extreme memory constraints is more challenging than on traditional microcontrollers. We address these issues with FreeML, a framework that optimizes pre-trained DNN models for memory-efficient and energy-adaptive inference on batteryless systems. FreeML comprises (1) a novel compression technique that simultaneously reduces model footprint and runtime memory requirements, making models executable on extremely memory-constrained batteryless platforms; and (2) the first early exit mechanism that uses a single exit branch for all exit points, enabling inference to terminate at any time and making models energy-adaptive with minimal memory overhead. Our experiments show that FreeML reduces model sizes by up to $95\times$, supports adaptive inference with $2.03-19.65\times$ less memory overhead, and provides significant time and energy benefits with only a negligible accuracy drop compared to the state of the art.