Abstract:Direct Preference Optimization (DPO) is widely utilized in the Reinforcement Learning from Human Feedback (RLHF) phase to align Large Language Models (LLMs) with human preferences, thereby enhancing both their harmlessness and efficacy. However, it has been observed that DPO tends to over-optimize for verbosity, which can detrimentally affect both performance and user experience. In this paper, we conduct an in-depth theoretical analysis of DPO's optimization objective and reveal a strong correlation between its implicit reward and data length. This correlation misguides the optimization direction, resulting in length sensitivity during the DPO training and leading to verbosity. To address this issue, we propose a length-desensitization improvement method for DPO, termed LD-DPO. The proposed method aims to desensitize DPO to data length by decoupling explicit length preference, which is relatively insignificant, from the other implicit preferences, thereby enabling more effective learning of the intrinsic preferences. We utilized two settings (Base and Instruct) of Llama2-13B, Llama3-8B, and Qwen2-7B for experimental validation on various benchmarks including MT-Bench and AlpacaEval 2. The experimental results indicate that LD-DPO consistently outperforms DPO and other baseline methods, achieving more concise responses with a 10-40\% reduction in length compared to DPO. We conducted in-depth experimental analyses to demonstrate that LD-DPO can indeed achieve length desensitization and align the model more closely with human-real preferences.
Abstract:Multi-modal Large Language Models (MLLMs) have demonstrated remarkable performance on various visual-language understanding and generation tasks. However, MLLMs occasionally generate content inconsistent with the given images, which is known as "hallucination". Prior works primarily center on evaluating hallucination using standard, unperturbed benchmarks, which overlook the prevalent occurrence of perturbed inputs in real-world scenarios-such as image cropping or blurring-that are critical for a comprehensive assessment of MLLMs' hallucination. In this paper, to bridge this gap, we propose Hallu-PI, the first benchmark designed to evaluate Hallucination in MLLMs within Perturbed Inputs. Specifically, Hallu-PI consists of seven perturbed scenarios, containing 1,260 perturbed images from 11 object types. Each image is accompanied by detailed annotations, which include fine-grained hallucination types, such as existence, attribute, and relation. We equip these annotations with a rich set of questions, making Hallu-PI suitable for both discriminative and generative tasks. Extensive experiments on 12 mainstream MLLMs, such as GPT-4V and Gemini-Pro Vision, demonstrate that these models exhibit significant hallucinations on Hallu-PI, which is not observed in unperturbed scenarios. Furthermore, our research reveals a severe bias in MLLMs' ability to handle different types of hallucinations. We also design two baselines specifically for perturbed scenarios, namely Perturbed-Reminder and Perturbed-ICL. We hope that our study will bring researchers' attention to the limitations of MLLMs when dealing with perturbed inputs, and spur further investigations to address this issue. Our code and datasets are publicly available at https://github.com/NJUNLP/Hallu-PI.
Abstract:Product reviews often contain a large number of implicit aspects and object-attribute co-existence cases. Unfortunately, many existing studies in Aspect-Based Sentiment Analysis (ABSA) have overlooked this issue, which can make it difficult to extract opinions comprehensively and fairly. In this paper, we propose a new task called Entity-Aspect-Opinion-Sentiment Quadruple Extraction (EASQE), which aims to hierarchically decompose aspect terms into entities and aspects to avoid information loss, non-exclusive annotations, and opinion misunderstandings in ABSA tasks. To facilitate research in this new task, we have constructed four datasets (Res14-EASQE, Res15-EASQE, Res16-EASQE, and Lap14-EASQE) based on the SemEval Restaurant and Laptop datasets. We have also proposed a novel two-stage sequence-tagging based Trigger-Opinion framework as the baseline for the EASQE task. Empirical evaluations show that our Trigger-Opinion framework can generate satisfactory EASQE results and can also be applied to other ABSA tasks, significantly outperforming state-of-the-art methods. We have made the four datasets and source code of Trigger-Opinion publicly available to facilitate further research in this area.
Abstract:Large Language Models (LLMs), such as ChatGPT and GPT-4, are designed to provide useful and safe responses. However, adversarial prompts known as 'jailbreaks' can circumvent safeguards, leading LLMs to generate harmful content. Exploring jailbreak prompts can help to better reveal the weaknesses of LLMs and further steer us to secure them. Unfortunately, existing jailbreak methods either suffer from intricate manual design or require optimization on another white-box model, compromising generalization or jailbreak efficiency. In this paper, we generalize jailbreak prompt attacks into two aspects: (1) Prompt Rewriting and (2) Scenario Nesting. Based on this, we propose ReNeLLM, an automatic framework that leverages LLMs themselves to generate effective jailbreak prompts. Extensive experiments demonstrate that ReNeLLM significantly improves the attack success rate while greatly reducing the time cost compared to existing baselines. Our study also reveals the inadequacy of current defense methods in safeguarding LLMs. Finally, we offer detailed analysis and discussion from the perspective of prompt execution priority on the failure of LLMs' defense. We hope that our research can catalyze both the academic community and LLMs vendors towards the provision of safer and more regulated Large Language Models.
Abstract:We study the problem of multimodal fusion in this paper. Recent exchanging-based methods have been proposed for vision-vision fusion, which aim to exchange embeddings learned from one modality to the other. However, most of them project inputs of multimodalities into different low-dimensional spaces and cannot be applied to the sequential input data. To solve these issues, in this paper, we propose a novel exchanging-based multimodal fusion model MuSE for text-vision fusion based on Transformer. We first use two encoders to separately map multimodal inputs into different low-dimensional spaces. Then we employ two decoders to regularize the embeddings and pull them into the same space. The two decoders capture the correlations between texts and images with the image captioning task and the text-to-image generation task, respectively. Further, based on the regularized embeddings, we present CrossTransformer, which uses two Transformer encoders with shared parameters as the backbone model to exchange knowledge between multimodalities. Specifically, CrossTransformer first learns the global contextual information of the inputs in the shallow layers. After that, it performs inter-modal exchange by selecting a proportion of tokens in one modality and replacing their embeddings with the average of embeddings in the other modality. We conduct extensive experiments to evaluate the performance of MuSE on the Multimodal Named Entity Recognition task and the Multimodal Sentiment Analysis task. Our results show the superiority of MuSE against other competitors. Our code and data are provided at https://github.com/RecklessRonan/MuSE.
Abstract:Meta-learning methods have been widely used in few-shot named entity recognition (NER), especially prototype-based methods. However, the Other(O) class is difficult to be represented by a prototype vector because there are generally a large number of samples in the class that have miscellaneous semantics. To solve the problem, we propose MeTNet, which generates prototype vectors for entity types only but not O-class. We design an improved triplet network to map samples and prototype vectors into a low-dimensional space that is easier to be classified and propose an adaptive margin for each entity type. The margin plays as a radius and controls a region with adaptive size in the low-dimensional space. Based on the regions, we propose a new inference procedure to predict the label of a query instance. We conduct extensive experiments in both in-domain and cross-domain settings to show the superiority of MeTNet over other state-of-the-art methods. In particular, we release a Chinese few-shot NER dataset FEW-COMM extracted from a well-known e-commerce platform. To the best of our knowledge, this is the first Chinese few-shot NER dataset. All the datasets and codes are provided at https://github.com/hccngu/MeTNet.
Abstract:Recent years have witnessed the great successes of embedding-based methods in recommender systems. Despite their decent performance, we argue one potential limitation of these methods -- the embedding magnitude has not been explicitly modulated, which may aggravate popularity bias and training instability, hindering the model from making a good recommendation. It motivates us to leverage the embedding normalization in recommendation. By normalizing user/item embeddings to a specific value, we empirically observe impressive performance gains (9\% on average) on four real-world datasets. Although encouraging, we also reveal a serious limitation when applying normalization in recommendation -- the performance is highly sensitive to the choice of the temperature $\tau$ which controls the scale of the normalized embeddings. To fully foster the merits of the normalization while circumvent its limitation, this work studied on how to adaptively set the proper $\tau$. Towards this end, we first make a comprehensive analyses of $\tau$ to fully understand its role on recommendation. We then accordingly develop an adaptive fine-grained strategy Adap-$\tau$ for the temperature with satisfying four desirable properties including adaptivity, personalized, efficiency and model-agnostic. Extensive experiments have been conducted to validate the effectiveness of the proposal. The code is available at \url{https://github.com/junkangwu/Adap_tau}.
Abstract:Learning hyperbolic embeddings for knowledge graph (KG) has gained increasing attention due to its superiority in capturing hierarchies. However, some important operations in hyperbolic space still lack good definitions, making existing methods unable to fully leverage the merits of hyperbolic space. Specifically, they suffer from two main limitations: 1) existing Graph Convolutional Network (GCN) methods in hyperbolic space rely on tangent space approximation, which would incur approximation error in representation learning, and 2) due to the lack of inner product operation definition in hyperbolic space, existing methods can only measure the plausibility of facts (links) with hyperbolic distance, which is difficult to capture complex data patterns. In this work, we contribute: 1) a Full Poincar\'{e} Multi-relational GCN that achieves graph information propagation in hyperbolic space without requiring any approximation, and 2) a hyperbolic generalization of Euclidean inner product that is beneficial to capture both hierarchical and complex patterns. On this basis, we further develop a \textbf{F}ully and \textbf{F}lexible \textbf{H}yperbolic \textbf{R}epresentation framework (\textbf{FFHR}) that is able to transfer recent Euclidean-based advances to hyperbolic space. We demonstrate it by instantiating FFHR with four representative KGC methods. Extensive experiments on benchmark datasets validate the superiority of our FFHRs over their Euclidean counterparts as well as state-of-the-art hyperbolic embedding methods.
Abstract:Recommender system usually suffers from severe popularity bias -- the collected interaction data usually exhibits quite imbalanced or even long-tailed distribution over items. Such skewed distribution may result from the users' conformity to the group, which deviates from reflecting users' true preference. Existing efforts for tackling this issue mainly focus on completely eliminating popularity bias. However, we argue that not all popularity bias is evil. Popularity bias not only results from conformity but also item quality, which is usually ignored by existing methods. Some items exhibit higher popularity as they have intrinsic better property. Blindly removing the popularity bias would lose such important signal, and further deteriorate model performance. To sufficiently exploit such important information for recommendation, it is essential to disentangle the benign popularity bias caused by item quality from the harmful popularity bias caused by conformity. Although important, it is quite challenging as we lack an explicit signal to differentiate the two factors of popularity bias. In this paper, we propose to leverage temporal information as the two factors exhibit quite different patterns along the time: item quality revealing item inherent property is stable and static while conformity that depends on items' recent clicks is highly time-sensitive. Correspondingly, we further propose a novel Time-aware DisEntangled framework (TIDE), where a click is generated from three components namely the static item quality, the dynamic conformity effect, as well as the user-item matching score returned by any recommendation model. Lastly, we conduct interventional inference such that the recommendation can benefit from the benign popularity bias while circumvent the harmful one. Extensive experiments on three real-world datasets demonstrated the effectiveness of TIDE.
Abstract:Knowledge graph completion (KGC) has become a focus of attention across deep learning community owing to its excellent contribution to numerous downstream tasks. Although recently have witnessed a surge of work on KGC, they are still insufficient to accurately capture complex relations, since they adopt the single and static representations. In this work, we propose a novel Disentangled Knowledge Graph Attention Network (DisenKGAT) for KGC, which leverages both micro-disentanglement and macro-disentanglement to exploit representations behind Knowledge graphs (KGs). To achieve micro-disentanglement, we put forward a novel relation-aware aggregation to learn diverse component representation. For macro-disentanglement, we leverage mutual information as a regularization to enhance independence. With the assistance of disentanglement, our model is able to generate adaptive representations in terms of the given scenario. Besides, our work has strong robustness and flexibility to adapt to various score functions. Extensive experiments on public benchmark datasets have been conducted to validate the superiority of DisenKGAT over existing methods in terms of both accuracy and explainability.