Abstract:Automatic speech recognition (ASR) systems are known to be vulnerable to adversarial attacks. This paper addresses detection and defence against targeted white-box attacks on speech signals for ASR systems. While existing work has utilised diffusion models (DMs) to purify adversarial examples, achieving state-of-the-art results in keyword spotting tasks, their effectiveness for more complex tasks such as sentence-level ASR remains unexplored. Additionally, the impact of the number of forward diffusion steps on performance is not well understood. In this paper, we systematically investigate the use of DMs for defending against adversarial attacks on sentences and examine the effect of varying forward diffusion steps. Through comprehensive experiments on the Mozilla Common Voice dataset, we demonstrate that two forward diffusion steps can completely defend against adversarial attacks on sentences. Moreover, we introduce a novel, training-free approach for detecting adversarial attacks by leveraging a pre-trained DM. Our experimental results show that this method can detect adversarial attacks with high accuracy.
Abstract:Estimating the probability of rare channel conditions is a central challenge in ultra-reliable wireless communication, where random events, such as deep fades, can cause sudden variations in the channel quality. This paper proposes a sample-efficient framework for predicting the statistics of such events by utilizing spatial dependency between channel measurements acquired from various locations. The proposed framework combines radio maps with non-parametric models and extreme value theory (EVT) to estimate rare-event channel statistics under a Bayesian formulation. The framework can be applied to a wide range of problems in wireless communication and is exemplified by rate selection in ultra-reliable communications. Notably, besides simulated data, the proposed framework is also validated with experimental measurements. The results in both cases show that the Bayesian formulation provides significantly better results in terms of throughput compared to baselines that do not leverage measurements from surrounding locations. It is also observed that the models based on EVT are generally more accurate in predicting rare-event statistics than non-parametric models, especially when only a limited number of channel samples are available. Overall, the proposed methods can significantly reduce the number of measurements required to predict rare channel conditions and guarantee reliability.