Abstract:Recently, there has been a growing interest in leveraging Large Language Models (LLMs) for recommendation systems, which usually adapt a pre-trained LLM to the recommendation scenario through supervised fine-tuning (SFT). However, both the pre-training and SFT stages fail to explicitly model the comparative relationships of a user's preferences on different items. To construct a "helpful and harmless" LLM-based recommender, we propose a general framework -- Recommendation with smoothing personalized Preference Optimization (RosePO), which better aligns with customized human values during the post-training stage. Specifically, in addition to the input and chosen response that naturally align with SFT data, we design a rejected sampling strategy tailored for enhancing helpfulness, along with two strategies aimed at mitigating biases to promote harmlessness. To ensure robustness against uncertain labels present in automatically constructed preference data, we introduce a personalized smoothing factor predicted by a preference oracle into the optimization objective. Evaluation on three real-world datasets demonstrates the effectiveness of our method, showcasing not only improved recommendation performance but also mitigation of semantic hallucination and popularity bias.
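To make the idea of a smoothing factor in a preference objective concrete, the following is a minimal sketch that assumes a DPO-style pairwise loss with label smoothing. The abstract does not state the exact form of the RosePO objective, so the function name `smoothed_preference_loss`, the `beta` temperature, and the treatment of `smooth` as a per-example scalar (a stand-in for the preference oracle's prediction) are all illustrative assumptions.

```python
import math


def smoothed_preference_loss(logp_chosen, logp_rejected,
                             ref_logp_chosen, ref_logp_rejected,
                             beta=0.1, smooth=0.0):
    """Label-smoothed pairwise preference loss (hypothetical, DPO-style).

    `smooth` in [0, 0.5] is the assumed probability that the
    (chosen, rejected) label is flipped; in this sketch it would be
    supplied per example by some preference oracle.
    """
    # How much more the policy favors the chosen item than the reference does.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))

    def log_sigmoid(z):
        # Numerically stable log(sigmoid(z)).
        if z >= 0:
            return -math.log1p(math.exp(-z))
        return z - math.log1p(math.exp(z))

    # Weighted mix of the loss for the given label and for the flipped label.
    return -(1.0 - smooth) * log_sigmoid(margin) - smooth * log_sigmoid(-margin)


# Example: the same pair scored with a confident and an uncertain label.
print(smoothed_preference_loss(-1.0, -5.0, -2.0, -2.0, beta=1.0, smooth=0.0))
print(smoothed_preference_loss(-1.0, -5.0, -2.0, -2.0, beta=1.0, smooth=0.3))
```

In this sketch a larger `smooth` penalizes very large margins, so pairs whose automatically constructed label is uncertain pull less strongly on the model.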
Abstract:We introduce a novel self-supervised deep clustering approach tailored for unstructured data without requiring prior knowledge of the number of clusters, termed Adaptive Self-supervised Robust Clustering (ASRC). In particular, ASRC adaptively learns the graph structure and edge weights to capture both local and global structural information. The obtained graph enables us to learn clustering-friendly feature representations using an enhanced graph auto-encoder with a contrastive learning technique. It further leverages the clustering results adaptively obtained by robust continuous clustering (RCC) to generate prototypes for negative sampling, which promotes consistency among positive pairs and enlarges the gap between positive and negative samples. ASRC obtains the final clustering results by applying RCC to the learned feature representations with their consistent graph structure and edge weights. Extensive experiments conducted on seven benchmark datasets demonstrate the efficacy of ASRC and its superior performance over other popular clustering models. Notably, ASRC even outperforms methods that rely on prior knowledge of the number of clusters, highlighting its effectiveness in addressing the challenges of clustering unstructured data.
Abstract:We introduce ``LAMBDA,'' a novel open-source, code-free multi-agent data analysis system that harnesses the power of large models. LAMBDA is designed to address data analysis challenges in complex data-driven applications through the use of innovatively designed data agents that operate iteratively and generatively using natural language. At the core of LAMBDA are two key agent roles: the programmer and the inspector, which are engineered to work together seamlessly. Specifically, the programmer generates code based on the user's instructions and domain-specific knowledge, enhanced by advanced models. Meanwhile, the inspector debugs the code when necessary. To ensure robustness and handle adverse scenarios, LAMBDA features a user interface that allows direct user intervention in the operational loop. Additionally, LAMBDA can flexibly integrate external models and algorithms through our knowledge integration mechanism, catering to the needs of customized data analysis. LAMBDA has demonstrated strong performance on various machine learning datasets. It has the potential to enhance data science practice and the data analysis paradigm by seamlessly integrating human and artificial intelligence, making data analysis more accessible, effective, and efficient for individuals from diverse backgrounds. The strong performance of LAMBDA in solving data science problems is demonstrated in several case studies, which are presented at \url{https://www.polyu.edu.hk/ama/cmfai/lambda.html}.
Abstract:We investigate certified robustness for GNNs under graph injection attacks. Existing research only provides sample-wise certificates by verifying each node independently, leading to very limited certifying performance. In this paper, we present the first collective certificate, which certifies a set of target nodes simultaneously. To achieve this, we formulate the problem as a binary integer quadratic constrained linear programming (BQCLP). We further develop a customized linearization technique that allows us to relax the BQCLP into a linear program (LP) that can be efficiently solved. Through comprehensive experiments, we demonstrate that our collective certification scheme significantly improves certification performance with minimal computational overhead. For instance, by solving the LP within 1 minute on the Citeseer dataset, we achieve a significant increase in the certified ratio from 0.0% to 81.2% when the injected node number is 5% of the graph size. Our work marks a crucial step towards making provable defense more practical.
Abstract:Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL), where a few examples are used to describe a task to the model. However, the performance of ICL varies significantly with the choice of demonstrations, and it is still unclear why this happens or what factors influence this choice. In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent. We further propose a data- and model-dependent demonstration selection method, \textbf{TopK + ConE}, based on the assumption that \textit{the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples}, resulting in a simple and effective recipe for ICL. Empirically, our method yields consistent improvements in both language understanding and generation tasks with different model scales. Further analyses confirm that, besides the generality and stability under different circumstances, our method provides a unified explanation for the effectiveness of previous methods. Code will be released.
Abstract:Predicting in-hospital mortality for intensive care unit (ICU) patients is key to final clinical outcomes. AI has shown superior accuracy but suffers from a lack of explainability. To address this issue, this paper proposes an eXplainable Multimodal Mortality Predictor (X-MMP), an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data. We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions. Furthermore, we introduce an explainable method, namely Layer-Wise Propagation to Transformer, as a proper extension of the LRP method to Transformers, producing explanations over multimodal inputs and revealing the salient features attributed to prediction. Moreover, the contribution of each modality to clinical outcomes can be visualized, assisting clinicians in understanding the reasoning behind decision-making. We construct a multimodal dataset based on MIMIC-III and MIMIC-III Waveform Database Matched Subset. Comprehensive experiments on benchmark datasets demonstrate that our proposed framework can achieve reasonable interpretation with competitive prediction accuracy. In particular, our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
Abstract:Sequential recommendation aims to predict the subsequent items matching user preference based on her/his historical interactions. With the development of Large Language Models (LLMs), there is growing interest in exploring the potential of LLMs for sequential recommendation by framing it as a language modeling task. Prior works represent items in the textual prompts using either ID indexing or text indexing and feed the prompts into LLMs, but fall short of either encapsulating comprehensive world knowledge or exhibiting sufficient sequential understanding. To harness the complementary strengths of traditional recommenders (which encode user behavioral knowledge) and LLMs (which possess world knowledge about items), we propose LLaRA -- a Large Language and Recommendation Assistant framework. Specifically, LLaRA represents items in the LLM's input prompts using a novel hybrid approach that integrates ID-based item embeddings from traditional recommenders with textual item features. Viewing the ``sequential behavior of the user'' as a new modality in recommendation, we employ an adapter to bridge the modality gap between ID embeddings of the traditional recommenders and the input space of LLMs. Furthermore, instead of directly exposing the hybrid prompt to LLMs, we apply a curriculum learning approach to gradually ramp up training complexity. We first warm up the LLM with text-only prompting, which aligns more naturally with the LLM's language modeling capabilities. Thereafter, we progressively transition to hybrid prompting, training the adapter to incorporate behavioral knowledge from the traditional sequential recommender into the LLM. Extensive experiments demonstrate the efficacy of the LLaRA framework. Our code and data are available at https://github.com/ljy0ustc/LLaRA .
Abstract:Sequential recommendation aims to recommend the next item that matches a user's interest, based on the sequence of items he/she interacted with before. Scrutinizing previous studies, we can summarize a common learning-to-classify paradigm -- given a positive item, a recommender model performs negative sampling to add negative items and learns to classify whether the user prefers them or not, based on his/her historical interaction sequence. Although effective, we reveal two inherent limitations: (1) it may differ from human behavior in that a user could imagine an oracle item in mind and select potential items matching the oracle; and (2) the classification is limited in the candidate pool with noisy or easy supervision from negative samples, which dilutes the preference signals towards the oracle item. Yet, generating the oracle item from the historical interaction sequence is mostly unexplored. To bridge the gap, we reshape sequential recommendation as a learning-to-generate paradigm, which is achieved via a guided diffusion model, termed DreamRec. Specifically, for a sequence of historical items, it applies a Transformer encoder to create guidance representations. Noising target items explores the underlying distribution of item space; then, with the guidance of historical interactions, the denoising process generates an oracle item to recover the positive item, so as to cast off negative sampling and depict the true preference of the user directly. We evaluate the effectiveness of DreamRec through extensive experiments and comparisons with existing methods. Codes and data are open-sourced at https://github.com/YangZhengyi98/DreamRec.
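The "noising target items" step can be illustrated with the standard DDPM forward process, x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps. The sketch below is a generic instance of that formula, not DreamRec's confirmed configuration: the linear schedule, the number of steps `T`, the 64-dimensional embedding, and the helper names `linear_beta_schedule` and `q_sample` are assumptions chosen for illustration, and the target item embedding is a random stand-in.

```python
import numpy as np


def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Assumed linear variance schedule with T diffusion steps."""
    return np.linspace(beta_start, beta_end, T)


def q_sample(x0, t, alphas_cumprod, rng):
    """Draw x_t from q(x_t | x_0) in closed form; also return the noise used."""
    noise = rng.normal(size=x0.shape)
    a_bar = alphas_cumprod[t]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
    return x_t, noise


T = 1000
betas = linear_beta_schedule(T)
alphas_cumprod = np.cumprod(1.0 - betas)  # abar_t for t = 0..T-1

rng = np.random.default_rng(0)
x0 = rng.normal(size=(64,))  # stand-in for a target item embedding

# Early steps keep most of the item signal; late steps are close to pure noise.
for t in (0, 500, 999):
    x_t, _ = q_sample(x0, t, alphas_cumprod, rng)
    print(t, float(np.sqrt(alphas_cumprod[t])))
```

A denoising network conditioned on the guidance representation of the interaction history would then be trained to invert this process; that part is omitted here because the abstract does not specify it.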
Abstract:Sequential recommendation is to predict the next item of interest for a user, based on her/his interaction history with previous items. In conventional sequential recommenders, a common approach is to model item sequences using discrete IDs, learning representations that encode sequential behaviors and reflect user preferences. Inspired by recent success in empowering large language models (LLMs) to understand and reason over diverse modality data (e.g., image, audio, 3D points), a compelling research question arises: ``Can LLMs understand and work with hidden representations from ID-based sequential recommenders?'' To answer this, we propose a simple framework, RecInterpreter, which examines the capacity of open-source LLMs to decipher the representation space of sequential recommenders. Specifically, with the multimodal pairs (i.e., representations of interaction sequence and text narrations), RecInterpreter first uses a lightweight adapter to map the representations into the token embedding space of the LLM. Subsequently, it constructs a sequence-recovery prompt that encourages the LLM to generate textual descriptions for items within the interaction sequence. Taking a step further, we propose a sequence-residual prompt instead, which guides the LLM in identifying the residual item by contrasting the representations before and after integrating this residual into the existing sequence. Empirical results showcase that our RecInterpreter enhances the exemplar LLM, LLaMA, to understand hidden representations from ID-based sequential recommenders, especially when guided by our sequence-residual prompts. Furthermore, RecInterpreter enables LLaMA to instantiate the oracle items generated by generative recommenders like DreamRec, concretizing the item a user would ideally like to interact with next. Codes are available at https://github.com/YangZhengyi98/RecInterpreter.
Abstract:In this paper, we propose a randomly projected convex clustering model for clustering a collection of $n$ high dimensional data points in $\mathbb{R}^d$ with $K$ hidden clusters. Compared to the convex clustering model for clustering original data with dimension $d$, we prove that, under some mild conditions, the perfect recovery of the cluster membership assignments of the convex clustering model, if it exists, can be preserved by the randomly projected convex clustering model with embedding dimension $m = O(\epsilon^{-2}\log(n))$, where $0 < \epsilon < 1$ is some given parameter. We further prove that the embedding dimension can be improved to be $O(\epsilon^{-2}\log(K))$, which is independent of the number of data points. Extensive numerical results are presented in this paper to demonstrate the robustness and superior performance of the randomly projected convex clustering model. The numerical results also demonstrate that the randomly projected convex clustering model can outperform the randomly projected K-means model in practice.
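The embedding dimension $m = O(\epsilon^{-2}\log(n))$ can be made tangible with a small Gaussian random projection experiment. This is a minimal sketch only: the constant `c` hidden in the big-O, the Gaussian projection matrix, and the helper names `random_projection` and `max_distortion` are assumptions for illustration, and the convex clustering step itself is not shown.

```python
import numpy as np


def random_projection(X, eps, c=8.0, seed=0):
    """Project the rows of X (n x d) down to m = ceil(c * log(n) / eps^2) dims.

    `c` is a hypothetical constant; the abstract only gives the O(.) rate.
    """
    n, d = X.shape
    m = int(np.ceil(c * np.log(n) / eps ** 2))
    rng = np.random.default_rng(seed)
    # Entries scaled by 1/sqrt(m) so squared norms are preserved in expectation.
    P = rng.normal(size=(d, m)) / np.sqrt(m)
    return X @ P


def max_distortion(X, Y):
    """Largest relative error in pairwise squared distances after projection."""
    def sqdists(A):
        G = A @ A.T
        sq = np.diag(G)
        return sq[:, None] + sq[None, :] - 2.0 * G

    DX, DY = sqdists(X), sqdists(Y)
    off_diag = ~np.eye(len(X), dtype=bool)  # skip zero self-distances
    return float(np.max(np.abs(DY[off_diag] / DX[off_diag] - 1.0)))


rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2000))      # n = 50 points in d = 2000 dimensions
Y = random_projection(X, eps=0.5)    # m depends on n and eps, not on d
print(Y.shape, max_distortion(X, Y))
```

Note that the projected dimension is computed from `n` and `eps` alone, which is the property the first bound in the abstract relies on; the improved $O(\epsilon^{-2}\log(K))$ bound would replace `n` with the number of clusters.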