Toon Vinck

Mitigating multiple single-event upsets during deep neural network inference using fault-aware training

Feb 13, 2025

Toon Vinck, Naïn Jonckers, Gert Dekkers, Jeffrey Prinzie, Peter Karsmakers

Abstract: Deep neural networks (DNNs) are increasingly used in safety-critical applications. Reliable fault analysis and mitigation are essential to ensure their functionality in harsh environments with high radiation levels. This study analyses the impact of multiple single-bit single-event upsets in DNNs by performing fault injection at the level of the DNN model. Additionally, a fault-aware training (FAT) methodology is proposed that improves the DNNs' robustness to faults without any modification to the hardware. Experimental results show that the FAT methodology improves the tolerance to faults by up to a factor of 3.

* 7 pages, 4 figures, Topical Workshop on Electronics for Particle Physics
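
To make the idea of model-level fault injection concrete, here is a minimal sketch of how several single-bit upsets could be injected into a weight tensor. It assumes weights stored as 32-bit floats and uniformly random fault locations; the function name `inject_bit_flips` and these choices are illustrative assumptions, not the procedure used in the paper.

```python
import numpy as np

def inject_bit_flips(weights, n_faults, rng):
    """Return a float32 copy of `weights` with `n_faults` random single-bit flips.

    Assumption: every bit of every weight is equally likely to be hit.
    If the same bit is drawn twice, the two flips cancel out.
    """
    w = np.asarray(weights, dtype=np.float32)
    # flatten() always copies, so the caller's array is left untouched.
    bits = w.flatten().view(np.uint32)
    # Pick which weight and which of its 32 bits each fault hits.
    idx = rng.integers(0, bits.size, size=n_faults)
    pos = rng.integers(0, 32, size=n_faults, dtype=np.uint32)
    # ufunc.at applies the XOR unbuffered, so repeated indices are handled.
    np.bitwise_xor.at(bits, idx, np.uint32(1) << pos)
    return bits.view(np.float32).reshape(w.shape)

rng = np.random.default_rng(0)
clean = np.zeros((4, 4), dtype=np.float32)
faulty = inject_bit_flips(clean, n_faults=3, rng=rng)
```

One plausible way to use such an injector for fault-aware training would be to perturb the weights in the forward pass of each training step, but the abstract does not describe the training procedure, so that remains an assumption.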

Single-Event Upset Analysis of a Systolic Array based Deep Neural Network Accelerator

May 24, 2024

Naïn Jonckers, Toon Vinck, Gert Dekkers, Peter Karsmakers, Jeffrey Prinzie

Abstract: Deep Neural Network (DNN) accelerators are extensively used to improve the computational efficiency of DNNs, but are prone to faults through Single-Event Upsets (SEUs). In this work, we present an in-depth analysis of the impact of SEUs on a Systolic Array (SA) based DNN accelerator. A fault injection campaign is performed through a Register-Transfer Level (RTL) based simulation environment to improve the observability of each hardware block, including the SA itself as well as the post-processing pipeline. From this analysis, we present the sensitivity, independent of a DNN model architecture, for various flip-flop groups both in terms of fault propagation probability and fault magnitude. This allows us to draw detailed conclusions and determine optimal mitigation strategies.

* This work has been submitted to RADECS 2024 for possible publication
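
The two metrics named in the abstract, fault propagation probability and fault magnitude, can be illustrated with a small behavioural sketch. It is not an RTL simulation: it flips one bit in a 32-bit accumulator of an integer matrix product and checks whether the error survives a post-processing step. The post-processing stage (ReLU, right shift, saturation), the matrix sizes, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def flip_bit_int32(value, bit):
    """Flip one bit of a 32-bit two's-complement integer."""
    u = (int(value) & 0xFFFFFFFF) ^ (1 << bit)
    return u - (1 << 32) if u >= (1 << 31) else u

def postprocess(acc, shift=8):
    """Hypothetical post-processing: ReLU, right shift, saturate to [0, 127]."""
    return np.clip(np.maximum(acc, 0) >> shift, 0, 127)

def campaign(n_trials, rng):
    """Estimate propagation probability and collect output error magnitudes."""
    propagated, magnitudes = 0, []
    for _ in range(n_trials):
        a = rng.integers(-128, 128, size=(4, 8))
        b = rng.integers(-128, 128, size=(8, 4))
        golden = a @ b                      # fault-free accumulators
        faulty = golden.copy()
        r, c = rng.integers(0, 4, size=2)   # which accumulator is hit
        bit = int(rng.integers(0, 32))      # which bit is upset
        faulty[r, c] = flip_bit_int32(faulty[r, c], bit)
        # A fault propagates only if the post-processed output changes.
        diff = int(np.abs(postprocess(faulty) - postprocess(golden)).max())
        if diff > 0:
            propagated += 1
            magnitudes.append(diff)
    return propagated / n_trials, magnitudes

rate, mags = campaign(1000, np.random.default_rng(0))
```

In this toy setup many upsets are masked, for example flips in low-order bits that the shift discards or flips in accumulators that ReLU clamps to zero, which is the kind of effect a per-block sensitivity analysis is meant to quantify.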