Abstract:Recent research indicates that large language models (LLMs) are susceptible to jailbreaking attacks that can elicit harmful content. This paper introduces a novel token-level attack method, Adaptive Dense-to-Sparse Constrained Optimization (ADC), which effectively jailbreaks several open-source LLMs. Our approach relaxes the discrete jailbreak optimization into a continuous optimization and progressively increases the sparsity of the optimized vectors. Consequently, our method bridges the gap between discrete and continuous space optimization. Experimental results demonstrate that our method is more effective and efficient than existing token-level methods. On HarmBench, our method achieves state-of-the-art attack success rates on seven out of eight LLMs. Code will be made available. Trigger Warning: This paper contains model behavior that can be offensive in nature.
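As a rough illustration of what "progressively increasing sparsity" can mean, the sketch below projects a dense probability vector over a toy vocabulary onto its k largest entries and renormalizes. The top-k projection, the function name `sparsify`, and the schedule of k values are assumptions made for illustration; the abstract does not specify how ADC enforces sparsity.

```python
def sparsify(probs, k):
    """Keep the k largest entries of a probability vector and renormalize.

    Hypothetical helper: with k == 1 the result is one-hot, i.e. a single
    discrete token, which is how a dense vector can end up in discrete space.
    """
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and len(probs)")
    # Indices of the k largest probabilities.
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    out = [0.0] * len(probs)
    for i in keep:
        out[i] = probs[i] / total
    return out


if __name__ == "__main__":
    dense = [0.1, 0.5, 0.4]
    # Assumed schedule: shrink k step by step until the vector is one-hot.
    for k in (3, 2, 1):
        print(k, sparsify(dense, k))
```

In a real attack the dense vector would also be updated by gradient steps between projections; that part is omitted here.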
Abstract:In the realms of computer vision and natural language processing, Large Vision-Language Models (LVLMs) have become indispensable tools, proficient in generating textual descriptions based on visual inputs. Despite their advancements, our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the prior of the underlying Large Language Model (LLM) rather than by the input image. Our empirical experiments underscore the persistence of this bias, as LVLMs often provide confident answers even in the absence of relevant images or when given incongruent visual input. To rectify these biases and redirect the model's focus toward visual information, we introduce two simple, training-free strategies. First, for tasks such as classification or multi-choice question-answering (QA), we propose a "calibration" step through an affine transformation to adjust the output distribution. This "Post-Hoc debias" approach ensures uniform scores for each answer when the image is absent, serving as an effective regularization technique to alleviate the influence of LLM priors. For more intricate open-ended generation tasks, we extend this method to "Debias sampling", drawing inspiration from contrastive decoding methods. Furthermore, our investigation sheds light on the instability of LVLMs across various decoding configurations. Through systematic exploration of different settings, we significantly enhance performance, surpassing reported results and raising concerns about the fairness of existing evaluations. Comprehensive experiments substantiate the effectiveness of our proposed strategies in mitigating biases. These strategies not only prove beneficial in minimizing hallucinations but also contribute to the generation of more helpful and precise illustrations.
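The calibration step can be sketched as follows, assuming the affine transformation is a diagonal rescaling with zero bias: each answer's probability is divided by the probability the model assigns to it without an image, then the result is renormalized. The function name `calibrate` and the diagonal form are assumptions; the abstract only states that scores become uniform when the image is absent.

```python
def calibrate(p_image, p_no_image, eps=1e-12):
    """Rescale answer probabilities so the image-free prior maps to uniform.

    p_image:    probabilities over answers given the real image.
    p_no_image: probabilities over the same answers with the image absent
                (an estimate of the LLM prior).
    """
    if len(p_image) != len(p_no_image):
        raise ValueError("both inputs must cover the same answers")
    # Diagonal weights 1 / prior; the bias term is assumed to be zero.
    weights = [1.0 / max(q, eps) for q in p_no_image]
    scaled = [w * p for w, p in zip(weights, p_image)]
    total = sum(scaled)
    return [s / total for s in scaled]


if __name__ == "__main__":
    prior = [0.7, 0.2, 0.1]       # model favors answer 0 even without an image
    with_image = [0.6, 0.3, 0.1]  # raw scores still dominated by the prior
    print(calibrate(with_image, prior))
```

Feeding the prior itself through `calibrate` yields equal scores for every answer, which is the property described above.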
Abstract:With the advance of language models, privacy protection is receiving more attention. Training data extraction is therefore of great importance, as it can serve as a tool to assess privacy leakage. However, due to the difficulty of this task, most existing methods are proofs of concept and are not yet effective enough. In this paper, we investigate and benchmark tricks for improving training data extraction using a publicly available dataset. Because most existing extraction methods use a generate-then-rank pipeline, i.e., generating text candidates as potential training data and then ranking them based on specific criteria, our research focuses on tricks for both text generation (e.g., sampling strategy) and text ranking (e.g., token-level criteria). The experimental results show that several previously overlooked tricks can be crucial to the success of training data extraction. Based on the GPT-Neo 1.3B evaluation results, our proposed tricks outperform the baseline by a large margin in most cases, providing a much stronger baseline for future research.
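The ranking half of such a pipeline can be illustrated with a minimal sketch that orders candidates by perplexity computed from per-token log-probabilities. Perplexity is used here only as one common ranking criterion and is an assumption, as are the function names and the toy log-probabilities; the abstract does not say which criteria the paper benchmarks.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_candidates(candidates):
    """Sort (text, token_logprobs) pairs from lowest to highest perplexity.

    Low perplexity means the model finds the text very predictable, which is
    treated here as a signal that it may have been memorized.
    """
    return sorted(candidates, key=lambda c: perplexity(c[1]))


if __name__ == "__main__":
    # Toy candidates with made-up log-probabilities.
    candidates = [
        ("sample A", [-2.0, -1.5, -3.0]),
        ("sample B", [-0.1, -0.2, -0.1]),
        ("sample C", [-1.0, -1.0, -1.0]),
    ]
    for text, logprobs in rank_candidates(candidates):
        print(text, round(perplexity(logprobs), 3))
```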
Abstract:Gait recognition aims to identify individuals by recognizing their walking patterns. However, we observe that most previous gait recognition methods degrade significantly due to two memorization effects, namely appearance memorization and label noise memorization. To address the problem, noisy gait recognition is studied for the first time, and a cyclic noise-tolerant network (CNTN) is proposed with a cyclic training algorithm, which equips two parallel networks with explicitly different abilities, namely one forgetting network and one memorizing network. The overall model will not memorize a pattern unless the two different networks both memorize it. Further, a more refined co-teaching constraint is imposed to help the model learn intrinsic patterns that are less influenced by memorization. Also, to address label noise memorization, an adaptive noise detection module is proposed to exclude samples with a high probability of being noisy from updating the model. Experiments are conducted on the three most popular benchmarks, and CNTN achieves state-of-the-art performance. We also reconstruct two noisy gait recognition datasets, on which CNTN gains significant improvements (notably a 6% improvement in the CL setting). CNTN is also compatible with any off-the-shelf backbone and improves each consistently.
Abstract:Gait recognition is a unique biometric technique that can be performed at a long distance non-cooperatively and has broad applications in public safety and intelligent traffic systems. Previous gait works focus more on minimizing the intra-class variance while ignoring the importance of constraining inter-class variance. To this end, we propose a generalized inter-class loss which addresses the inter-class variance from both the sample-level feature distribution and the class-level feature distribution. Instead of applying equal penalty strength to pair scores, the proposed loss optimizes the sample-level inter-class feature distribution by dynamically adjusting the pairwise weight. Further, at the class level, the generalized inter-class loss adds a constraint on the uniformity of the inter-class feature distribution, which forces the feature representations to approximate a hypersphere and keep maximal inter-class variance. In addition, the proposed method automatically adjusts the margin between classes, which makes the inter-class feature distribution more flexible. The proposed method can be generalized to different gait recognition networks and achieves significant improvements. We conduct a series of experiments on CASIA-B and OUMVLP, and the experimental results show that the proposed loss significantly improves performance and achieves state-of-the-art results.
Abstract:Multivariate time-series forecasting is a critical task for many applications, and graph time-series networks are widely studied for their ability to capture spatial and temporal correlations simultaneously. However, most existing works focus on learning with the explicit prior graph structure while ignoring potential information from the implicit graph structure, yielding incomplete structure modeling. Some recent works attempt to learn the intrinsic or implicit graph structure directly but lack a way to combine the explicit prior structure with the implicit structure. In this paper, we propose the Regularized Graph Structure Learning (RGSL) model to incorporate both explicit prior structure and implicit structure, and to learn the forecasting deep networks along with the graph structure. RGSL consists of two innovative modules. First, we derive an implicit dense similarity matrix through node embedding, and learn the sparse graph structure using Regularized Graph Generation (RGG) based on the Gumbel Softmax trick. Second, we propose a Laplacian Matrix Mixed-up Module (LM3) to fuse the explicit graph and the implicit graph. We conduct experiments on three real-world datasets. Results show that the proposed RGSL model outperforms existing graph forecasting algorithms by a notable margin, while learning meaningful graph structure simultaneously. Our code and models are made publicly available at https://github.com/alipay/RGSL.git.
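For readers unfamiliar with the Gumbel Softmax trick mentioned above, the following is a minimal standard-library sketch of the generic technique: Gumbel noise is added to a row of similarity scores and a temperature-scaled softmax is applied, so that a low temperature gives a nearly one-hot (sparse) edge selection. This is not the RGG module itself; the function name, the toy scores, and the temperature value are assumptions.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed categorical sample from unnormalized scores.

    tau is the temperature: smaller values push the sample toward one-hot.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    perturbed = []
    for logit in logits:
        u = max(rng.random(), 1e-12)      # avoid log(0)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        perturbed.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed scores.
    m = max(perturbed)
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy similarity scores from one node to three candidate neighbors.
    scores = [2.0, 0.5, -1.0]
    print(gumbel_softmax(scores, tau=0.5, rng=rng))
```

Because the sample is a smooth function of the scores, gradients can flow through it during training, which is the usual reason for choosing this relaxation over hard sampling.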