Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Valentino Peluso

Adaptive Test-Time Augmentation for Low-Power CPU

May 13, 2021

Luca Mocerino, Roberto G. Rizzo, Valentino Peluso, Andrea Calimera, Enrico Macii

Figure 1 for Adaptive Test-Time Augmentation for Low-Power CPU

Figure 2 for Adaptive Test-Time Augmentation for Low-Power CPU

Figure 3 for Adaptive Test-Time Augmentation for Low-Power CPU

Figure 4 for Adaptive Test-Time Augmentation for Low-Power CPU

Abstract:Convolutional Neural Networks (ConvNets) are trained offline using the few available data and may therefore suffer from substantial accuracy loss when ported on the field, where unseen input patterns received under unpredictable external conditions can mislead the model. Test-Time Augmentation (TTA) techniques aim to alleviate such common side effect at inference-time, first running multiple feed-forward passes on a set of altered versions of the same input sample, and then computing the main outcome through a consensus of the aggregated predictions. Unfortunately, the implementation of TTA on embedded CPUs introduces latency penalties that limit its adoption on edge applications. To tackle this issue, we propose AdapTTA, an adaptive implementation of TTA that controls the number of feed-forward passes dynamically, depending on the complexity of the input. Experimental results on state-of-the-art ConvNets for image classification deployed on a commercial ARM Cortex-A CPU demonstrate AdapTTA reaches remarkable latency savings, from 1.49X to 2.21X, and hence a higher frame rate compared to static TTA, still preserving the same accuracy gain.

Via

Access Paper or Ask Questions

EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets

Dec 20, 2019

Matteo Grimaldi, Valentino Peluso, Andrea Calimera

Figure 1 for EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets

Figure 2 for EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets

Figure 3 for EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets

Figure 4 for EAST: Encoding-Aware Sparse Training for Deep Memory Compression of ConvNets

Abstract:The implementation of Deep Convolutional Neural Networks (ConvNets) on tiny end-nodes with limited non-volatile memory space calls for smart compression strategies capable of shrinking the footprint yet preserving predictive accuracy. There exist a number of strategies for this purpose, from those that play with the topology of the model or the arithmetic precision, e.g. pruning and quantization, to those that operate a model agnostic compression, e.g. weight encoding. The tighter the memory constraint, the higher the probability that these techniques alone cannot meet the requirement, hence more awareness and cooperation across different optimizations become mandatory. This work addresses the issue by introducing EAST, Encoding-Aware Sparse Training, a novel memory-constrained training procedure that leads quantized ConvNets towards deep memory compression. EAST implements an adaptive group pruning designed to maximize the compression rate of the weight encoding scheme (the LZ4 algorithm in this work). If compared to existing methods, EAST meets the memory constraint with lower sparsity, hence ensuring higher accuracy. Results conducted on a state-of-the-art ConvNet (ResNet-9) deployed on a low-power microcontroller (ARM Cortex-M4) validate the proposal.

Via

Access Paper or Ask Questions