Abstract:Model uncertainty quantification involves measuring and evaluating the uncertainty linked to a model's predictions, helping assess their reliability and confidence. Noise injection is a technique used to enhance the robustness of neural networks by introducing randomness. In this paper, we establish a connection between noise injection and uncertainty quantification from a Bayesian standpoint. We theoretically demonstrate that injecting noise into the weights of a neural network is equivalent to Bayesian inference on a deep Gaussian process. Consequently, we introduce a Monte Carlo Noise Injection (MCNI) method, which involves injecting noise into the parameters during training and performing multiple forward propagations during inference to estimate the uncertainty of the prediction. Through simulation and experiments on regression and classification tasks, our method demonstrates superior performance compared to the baseline model.
Abstract:With the increasing use of deep learning on data collected by non-perfect sensors and in non-perfect environments, the robustness of deep learning systems has become an important issue. A common approach for obtaining robustness to noise has been to train deep learning systems with data augmented with Gaussian noise. In this work, we challenge the common choice of Gaussian noise and explore the possibility of stronger robustness for non-Gaussian impulsive noise, specifically alpha-stable noise. Justified by the Generalized Central Limit Theorem and evidenced by observations in various application areas, alpha-stable noise is widely present in nature. By comparing the testing accuracy of models trained with Gaussian noise and alpha-stable noise on data corrupted by different noise, we find that training with alpha-stable noise is more effective than Gaussian noise, especially when the dataset is corrupted by impulsive noise, thus improving the robustness of the model. The generality of this conclusion is validated through experiments conducted on various deep learning models with image and time series datasets, and other benchmark corrupted datasets. Consequently, we propose a novel data augmentation method that replaces Gaussian noise, which is typically added to the training data, with alpha-stable noise.