Adrian Frischknecht

Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators

Sep 13, 2024

Konstantin Lübeck, Alexander Louis-Ferdinand Jung, Felix Wedlich, Mika Markus Müller, Federico Nicolás Peccia, Felix Thömmes, Jannik Steinmetz, Valentin Biermaier, Adrian Frischknecht, Paul Palomero Bernardo(+1 more)

Abstract: Implementing Deep Neural Networks (DNNs) on resource-constrained edge devices is a challenging task that requires tailored hardware accelerator architectures and a clear understanding of their performance characteristics when executing the intended AI workload. To facilitate this, we present an automated generation approach for fast performance models that accurately estimate the latency of a DNN mapped onto systematically modeled and concisely described accelerator architectures. Using our accelerator architecture description method, we modeled representative DNN accelerators such as Gemmini, UltraTrail, a Plasticine-derived architecture, and a parameterizable systolic array. Together with DNN mappings for those modeled architectures, we perform a combined DNN/hardware dependency graph analysis, which enables us, in the best case, to evaluate only 154 loop kernel iterations to estimate the performance for 4.19 billion instructions, achieving a significant speedup. We outperform regression and analytical models in terms of mean absolute percentage error (MAPE) compared to simulation results, while being several orders of magnitude faster than an RTL simulation.
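The accuracy metric named in the abstract, MAPE, can be illustrated with a minimal sketch. The function name `mape` and the latency values below are hypothetical and are not taken from the paper; the sketch only shows the standard definition, with simulation results used as the reference.

```python
def mape(estimates, references):
    """Mean absolute percentage error of estimates against references, in percent."""
    if len(estimates) != len(references) or len(references) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for est, ref in zip(estimates, references):
        if ref == 0:
            raise ValueError("reference values must be nonzero")
        # Relative deviation of one estimate from its reference value.
        total += abs((est - ref) / ref)
    return 100.0 * total / len(references)


# Hypothetical per-layer latencies in cycles (illustrative values only).
simulated = [1000.0, 2000.0, 4000.0]   # reference, e.g. from a simulation
estimated = [1100.0, 1900.0, 4000.0]   # output of a performance model

print(f"MAPE: {mape(estimated, simulated):.2f} %")
```

A lower value means the model's latency estimates track the simulation more closely.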

* Accepted version for: ACM Transactions on Embedded Computing Systems

Hardware Accelerator and Neural Network Co-Optimization for Ultra-Low-Power Audio Processing Devices

Sep 08, 2022

Christoph Gerum, Adrian Frischknecht, Tobias Hald, Paul Palomero Bernardo, Konstantin Lübeck, Oliver Bringmann

Abstract: The increasing spread of artificial neural networks does not stop at ultra-low-power edge devices. However, these networks often have high computational demands and require specialized hardware accelerators to ensure the design meets power and performance constraints. The manual optimization of neural networks along with the corresponding hardware accelerators can be very challenging. This paper presents HANNAH (Hardware Accelerator and Neural Network seArcH), a framework for automated and combined hardware/software co-design of deep neural networks and hardware accelerators for resource- and power-constrained edge devices. The optimization approach uses an evolution-based search algorithm, a neural network template technique, and analytical KPI models for the configurable UltraTrail hardware accelerator template to find an optimized neural network and accelerator configuration. We demonstrate that HANNAH can find suitable neural networks with minimized power consumption and high accuracy for different audio classification tasks such as single-class wake word detection, multi-class keyword detection, and voice activity detection, which are superior to the related work.

* Accepted version for: EUROMICRO DSD 2022

Behavior of Keyword Spotting Networks Under Noisy Conditions

Sep 15, 2021

Anwesh Mohanty, Adrian Frischknecht, Christoph Gerum, Oliver Bringmann

Abstract: Keyword spotting (KWS) is becoming a ubiquitous need with the advancement in artificial intelligence and smart devices. Recent work in this field has focused on several different architectures to achieve good results on datasets with low to moderate noise. However, the performance of these models deteriorates under high noise conditions, as shown by our experiments. In our paper, we present an extensive comparison between state-of-the-art KWS networks under various noisy conditions. We also suggest adaptive batch normalization as a technique to improve the performance of the networks when the noise files are unknown during the training phase. The results of such high noise characterization enable future work in developing models that perform better in the aforementioned conditions.
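The abstract does not spell out the adaptation procedure. Adaptive batch normalization is commonly understood as re-estimating a layer's normalization statistics on data from the target condition while leaving the learned scale and shift untouched. The following is a minimal single-feature sketch under that assumption; the class `BatchNorm1D`, its `adapt` method, and all values are hypothetical and do not come from the paper.

```python
from math import sqrt
from statistics import fmean, pvariance


class BatchNorm1D:
    """Single-feature batch normalization in inference mode (illustrative only)."""

    def __init__(self, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
        self.mean = mean      # statistic estimated during training
        self.var = var        # statistic estimated during training
        self.gamma = gamma    # learned scale, not changed by adaptation
        self.beta = beta      # learned shift, not changed by adaptation
        self.eps = eps

    def __call__(self, x):
        return self.gamma * (x - self.mean) / sqrt(self.var + self.eps) + self.beta

    def adapt(self, activations):
        # Replace the stored statistics with those of unlabeled data from the
        # new (e.g. noisy) condition; no labels or gradient steps are needed.
        self.mean = fmean(activations)
        self.var = pvariance(activations)


# Statistics from clean training data (hypothetical values).
bn = BatchNorm1D(mean=0.0, var=1.0)

# Activations observed under an unseen noise condition (hypothetical values).
noisy_activations = [2.0, 4.0, 6.0, 8.0]

bn.adapt(noisy_activations)
normalized = [bn(x) for x in noisy_activations]
```

After adaptation, the normalized activations are centered again, which is the property that the following layers were trained to expect.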

* ICANN 2021. Lecture Notes in Computer Science, vol. 12891, pp. 369-378. Springer
* 11 pages, 5 figures, published in the Lecture Notes in Computer Science book series (LNCS, volume 12891)
