Title: High-Level Features Parallelization for Inference Cost Reduction Through Selective Attention

Aug 09, 2023

André Peter Kelm, Lucas Schmidt, Tim Rolff, Christian Wilms, Ehsan Yaghoubi, Simone Frintrop

[Figures 1 to 4 of the paper are not reproduced here.]

Abstract: In this work, we parallelize high-level features in deep networks so that class-specific features can be selectively skipped or selected to reduce inference costs. This is challenging for most deep learning methods because they have limited ability to focus efficiently and effectively on selected class-specific features without retraining. We propose a serial-parallel hybrid architecture with serial generic low-level features and parallel high-level features. This accounts for the fact that many high-level features are class-specific rather than generic, and it connects to recent neuroscientific findings of spatially and contextually separated neural activations in the human brain. Our approach provides the unique functionality of cutouts: selecting parts of the network to focus only on relevant subsets of classes, without requiring retraining. High performance is maintained, while the cost of inference can be significantly reduced. In some of our examples, up to $75\,\%$ of parameters are skipped and $35\,\%$ fewer GMACs (giga multiply-accumulate operations) are used as the approach adapts to a change in task complexity. This is important for mobile, industrial, and robotic applications, where reducing the number of parameters, the computational complexity, and thus the power consumption can be paramount. Another unique functionality is that processing can be directly influenced by enhancing or inhibiting high-level class-specific features, similar to the mechanism of selective attention in the human brain. This can be relevant for cross-modal applications, the use of semantic prior knowledge, and/or context-aware processing.
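The idea of a serial trunk followed by parallel class-specific branches can be illustrated with a toy sketch. It is not the authors' implementation: the class `HybridNet`, its layer sizes, the `active` argument (standing in for a cutout), and the `gain` argument (standing in for enhancing or inhibiting a branch) are all hypothetical names chosen for illustration, and the weights are random rather than trained. The sketch only shows why skipping branches needs no retraining: each branch depends on the shared features alone, so the scores of the remaining classes are unchanged.

```python
import random


def make_weights(rows, cols, rng):
    """Random weight matrix as a list of rows (stand-in for trained weights)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def dense_relu(weights, x):
    """Dense layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


class HybridNet:
    """Hypothetical toy model: one serial trunk, one parallel branch per class."""

    def __init__(self, n_in, n_shared, n_branch, classes, seed=0):
        rng = random.Random(seed)
        # Serial part: generic low-level features shared by all classes.
        self.trunk = make_weights(n_shared, n_in, rng)
        # Parallel part: an independent high-level branch for each class.
        self.branches = {
            c: {
                "hidden": make_weights(n_branch, n_shared, rng),
                "out": make_weights(1, n_branch, rng)[0],
            }
            for c in classes
        }

    def forward(self, x, active=None, gain=None):
        """Score only the classes in `active`; `gain` scales a branch's features."""
        active = list(self.branches) if active is None else active
        gain = gain or {}
        shared = dense_relu(self.trunk, x)  # computed once, reused by every branch
        scores = {}
        for c in active:  # branches outside the cutout are never evaluated
            branch = self.branches[c]
            feats = [gain.get(c, 1.0) * f for f in dense_relu(branch["hidden"], shared)]
            scores[c] = sum(w * f for w, f in zip(branch["out"], feats))
        return scores

    def param_count(self, active=None):
        """Number of weights touched when only `active` branches are evaluated."""
        active = list(self.branches) if active is None else active
        total = sum(len(row) for row in self.trunk)
        for c in active:
            branch = self.branches[c]
            total += sum(len(row) for row in branch["hidden"]) + len(branch["out"])
        return total


net = HybridNet(n_in=8, n_shared=4, n_branch=16, classes=["cat", "dog", "car", "bus"])
x = [0.5] * 8

full = net.forward(x)                           # all four branches
cutout = net.forward(x, active=["cat", "dog"])  # only two branches evaluated
boosted = net.forward(x, active=["cat", "dog"], gain={"cat": 2.0, "dog": 0.5})

print(net.param_count(), net.param_count(["cat", "dog"]))
```

In this toy configuration the parameter saving comes entirely from the skipped branches, so its size depends on how much of the model sits in the parallel part. The figures reported in the abstract refer to the authors' networks, not to this sketch.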
