Abstract:Large Language Models (LLMs) are widely used in natural language processing but face the risk of jailbreak attacks that maliciously induce them to generate harmful content. Existing jailbreak attacks, including character-level and context-level attacks, mainly focus on the prompt of the plain text without specifically exploring the significant influence of its structure. In this paper, we focus on studying how prompt structure contributes to the jailbreak attack. We introduce a novel structure-level attack method based on tail structures that are rarely used during LLM training, which we refer to as Uncommon Text-Encoded Structure (UTES). We extensively study 12 UTESs templates and 6 obfuscation methods to build an effective automated jailbreak tool named StructuralSleight that contains three escalating attack strategies: Structural Attack, Structural and Character/Context Obfuscation Attack, and Fully Obfuscated Structural Attack. Extensive experiments on existing LLMs show that StructuralSleight significantly outperforms baseline methods. In particular, the attack success rate reaches 94.62\% on GPT-4o, which has not been addressed by state-of-the-art techniques.
Abstract:Sound source localization (SSL) involves estimating the direction of arrival (DOA) of a sound signal. The output space of the DOA estimation is continuous, suggesting that regression may be the most appropriate formulation for DOA. However, in practice, converting the DOA estimation into a classification problem often results in better performance than the regression formulation, since that classification problems are generally easier to model, and are more robust in handling noise and uncertainty than regression problems. In the classification formulation of DOA, the output space is discretized into several intervals, each of which is treated as a class. These classes exhibit strong inter-class correlation, with their mutual-similarity increasing when they approach each other and being ordered. However, this property is not sufficiently explored. To exploit these property, we propose a soft label distribution, named Unbiased Label Distribution (ULD), for eliminating the quantization error of the training target and further taking the inter-class similarity into strong consideration. We further introduce two loss functions, named the Negative Log Absolute Error (NLAE) loss function and {Mean Squared Error loss function without activation (MSE(wo))}, for the soft label family. Finally, we design a new decoding method to map the predicted distribution to sound source locations, called Weighted Adjacent Decoding (WAD). It uses the weighted sum of the probabilities of the peak classes and their adjacent classes in the predicted distribution for decoding. Experimental results show that the proposed method achieves the state-of-the-art performance, and the WAD decoding method is able to even breakthrough the quantization error limits of existing decoding methods.
Abstract:Recently, an end-to-end two-dimensional sound source localization algorithm with ad-hoc microphone arrays formulates the sound source localization problem as a classification problem. The algorithm divides the target indoor space into a set of local areas, and predicts the local area where the speaker locates. However, the local areas are encoded by one-hot code, which may lose the connections between the local areas due to quantization errors. In this paper, we propose a new soft label coding method, named label smoothing, for the classification-based two-dimensional sound source location with ad-hoc microphone arrays. The core idea is to take the geometric connection between the classes into the label coding process.The first one is named static soft label coding (SSLC), which modifies the one-hot codes into soft codes based on the distances between the local areas. Because SSLC is handcrafted which may not be optimal, the second one, named dynamic soft label coding (DSLC), further rectifies SSLC, by learning the soft codes according to the statistics of the predictions produced by the classification-based localization model in the training stage. Experimental results show that the proposed methods can effectively improve the localization accuracy.