Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xingwei Wang

RAGAR: Retrieval Augment Personalized Image Generation Guided by Recommendation

May 03, 2025

Run Ling, Wenji Wang, Yuting Liu, Guibing Guo, Linying Jiang, Xingwei Wang

Abstract:Personalized image generation is crucial for improving the user experience, as it renders reference images into preferred ones according to user visual preferences. Although effective, existing methods face two main issues. First, existing methods treat all items in the user historical sequence equally when extracting user preferences, overlooking the varying semantic similarities between historical items and the reference item. Disproportionately high weights for low-similarity items distort users' visual preferences for the reference item. Second, existing methods heavily rely on consistency between generated and reference images to optimize the generation, which leads to underfitting user preferences and hinders personalization. To address these issues, we propose Retrieval Augment Personalized Image GenerAtion guided by Recommendation (RAGAR). Our approach uses a retrieval mechanism to assign different weights to historical items according to their similarities to the reference item, thereby extracting more refined users' visual preferences for the reference item. Then we introduce a novel rank task based on the multi-modal ranking model to optimize the personalization of the generated images instead of forcing depend on consistency. Extensive experiments and human evaluations on three real-world datasets demonstrate that RAGAR achieves significant improvements in both personalization and semantic metrics compared to five baselines.

Via

Access Paper or Ask Questions

Data Augmentation as Free Lunch: Exploring the Test-Time Augmentation for Sequential Recommendation

Apr 07, 2025

Yizhou Dang, Yuting Liu, Enneng Yang, Minhan Huang, Guibing Guo, Jianzhe Zhao, Xingwei Wang

Abstract:Data augmentation has become a promising method of mitigating data sparsity in sequential recommendation. Existing methods generate new yet effective data during model training to improve performance. However, deploying them requires retraining, architecture modification, or introducing additional learnable parameters. The above steps are time-consuming and costly for well-trained models, especially when the model scale becomes large. In this work, we explore the test-time augmentation (TTA) for sequential recommendation, which augments the inputs during the model inference and then aggregates the model's predictions for augmented data to improve final accuracy. It avoids significant time and cost overhead from loss calculation and backward propagation. We first experimentally disclose the potential of existing augmentation operators for TTA and find that the Mask and Substitute consistently achieve better performance. Further analysis reveals that these two operators are effective because they retain the original sequential pattern while adding appropriate perturbations. Meanwhile, we argue that these two operators still face time-consuming item selection or interference information from mask tokens. Based on the analysis and limitations, we present TNoise and TMask. The former injects uniform noise into the original representation, avoiding the computational overhead of item selection. The latter blocks mask token from participating in model calculations or directly removes interactions that should have been replaced with mask tokens. Comprehensive experiments demonstrate the effectiveness, efficiency, and generalizability of our method. We provide an anonymous implementation at https://github.com/KingGugu/TTA4SR.

* Accepted by SIGIR 2025

Via

Access Paper or Ask Questions

Can LLM-Driven Hard Negative Sampling Empower Collaborative Filtering? Findings and Potentials

Apr 07, 2025

Chu Zhao, Enneng Yang, Yuting Liu, Jianzhe Zhao, Guibing Guo, Xingwei Wang

Abstract:Hard negative samples can accelerate model convergence and optimize decision boundaries, which is key to improving the performance of recommender systems. Although large language models (LLMs) possess strong semantic understanding and generation capabilities, systematic research has not yet been conducted on how to generate hard negative samples effectively. To fill this gap, this paper introduces the concept of Semantic Negative Sampling and exploreshow to optimize LLMs for high-quality, hard negative sampling. Specifically, we design an experimental pipeline that includes three main modules, profile generation, semantic negative sampling, and semantic alignment, to verify the potential of LLM-driven hard negative sampling in enhancing the accuracy of collaborative filtering (CF). Experimental results indicate that hard negative samples generated based on LLMs, when semantically aligned and integrated into CF, can significantly improve CF performance, although there is still a certain gap compared to traditional negative sampling methods. Further analysis reveals that this gap primarily arises from two major challenges: noisy samples and lack of behavioral constraints. To address these challenges, we propose a framework called HNLMRec, based on fine-tuning LLMs supervised by collaborative signals. Experimental results show that this framework outperforms traditional negative sampling and other LLM-driven recommendation methods across multiple datasets, providing new solutions for empowering traditional RS with LLMs. Additionally, we validate the excellent generalization ability of the LLM-based semantic negative sampling method on new datasets, demonstrating its potential in alleviating issues such as data sparsity, popularity bias, and the problem of false hard negative samples. Our implementation code is available at https://github.com/user683/HNLMRec.

* 11 pages

Via

Access Paper or Ask Questions

Distributionally Robust Graph Out-of-Distribution Recommendation via Diffusion Model

Jan 26, 2025

Chu Zhao, Enneng Yang, Yuliang Liang, Jianzhe Zhao, Guibing Guo, Xingwei Wang

Abstract:The distributionally robust optimization (DRO)-based graph neural network methods improve recommendation systems' out-of-distribution (OOD) generalization by optimizing the model's worst-case performance. However, these studies fail to consider the impact of noisy samples in the training data, which results in diminished generalization capabilities and lower accuracy. Through experimental and theoretical analysis, this paper reveals that current DRO-based graph recommendation methods assign greater weight to noise distribution, leading to model parameter learning being dominated by it. When the model overly focuses on fitting noise samples in the training data, it may learn irrelevant or meaningless features that cannot be generalized to OOD data. To address this challenge, we design a Distributionally Robust Graph model for OOD recommendation (DRGO). Specifically, our method first employs a simple and effective diffusion paradigm to alleviate the noisy effect in the latent space. Additionally, an entropy regularization term is introduced in the DRO objective function to avoid extreme sample weights in the worst-case distribution. Finally, we provide a theoretical proof of the generalization error bound of DRGO as well as a theoretical analysis of how our approach mitigates noisy sample effects, which helps to better understand the proposed framework from a theoretical perspective. We conduct extensive experiments on four datasets to evaluate the effectiveness of our framework against three typical distribution shifts, and the results demonstrate its superiority in both independently and identically distributed distributions (IID) and OOD.

* 14 pages, Accepted by WWW'25

Via

Access Paper or Ask Questions

Self-supervised Hierarchical Representation for Medication Recommendation

Nov 05, 2024

Yuliang Liang, Yuting Liu, Yizhou Dang, Enneng Yang, Guibing Guo, Wei Cai, Jianzhe Zhao, Xingwei Wang

Figure 1 for Self-supervised Hierarchical Representation for Medication Recommendation

Figure 2 for Self-supervised Hierarchical Representation for Medication Recommendation

Figure 3 for Self-supervised Hierarchical Representation for Medication Recommendation

Figure 4 for Self-supervised Hierarchical Representation for Medication Recommendation

Abstract:Medication recommender is to suggest appropriate medication combinations based on a patient's health history, e.g., diagnoses and procedures. Existing works represent different diagnoses/procedures well separated by one-hot encodings. However, they ignore the latent hierarchical structures of these medical terms, undermining the generalization performance of the model. For example, "Respiratory Diseases", "Chronic Respiratory Diseases" and "Chronic Bronchiti" have a hierarchical relationship, progressing from general to specific. To address this issue, we propose a novel hierarchical encoder named HIER to hierarchically represent diagnoses and procedures, which is based on standard medical codes and compatible with any existing methods. Specifically, the proposed method learns relation embedding with a self-supervised objective for incorporating the neighbor hierarchical structure. Additionally, we develop the position encoding to explicitly introduce global hierarchical position. Extensive experiments demonstrate significant and consistent improvements in recommendation accuracy across four baselines and two real-world clinical datasets.

Via

Access Paper or Ask Questions

Double-Side Delay Alignment Modulation for Multi-User Millimeter Wave and TeraHertz Communications

Oct 22, 2024

Xingwei Wang, Haiquan Lu, Jieni Zhang, Yong Zeng

Abstract:Delay alignment modulation (DAM) is an innovative broadband modulation technique well suited for millimeter wave (mmWave) and terahertz (THz) massive multiple-input multiple-output (MIMO) communication systems. Leveraging the high spatial resolution and sparsity of multi-path channels, DAM mitigates inter-symbol interference (ISI) effectively, by aligning all multi-path components through a combination of delay pre/post-compensation and path-based beamforming. As such, ISI is eliminated while preserving multi-path power gains. In this paper, we explore multi-user double-side DAM with both delay pre-compensation at the transmitter and post-compensation at the receiver, contrasting with prior one-side DAM that primarily focuses on delay pre-compensation only. Firstly, we reveal the constraint for the introduced delays and the delay pre/post-compensation vectors tailored for multi-user double-side DAM, given a specific number of delay pre/post-compensations. Furthermore, we show that as long as the number of base station (BS)/user equipment (UE) antennas is sufficiently large, single-side DAM, where delay compensation is only performed at the BS/UE, is preferred than double-side DAM since the former results in less ISI to be spatially eliminated. Next, we propose two low-complexity path-based beamforming strategies based on the eigen-beamforming transmission and ISI-zero forcing (ZF) principles, respectively, based on which the achievable sum rates are studied. Simulation results verify that with sufficiently large BS/UE antennas, single-side DAM is sufficient. Furthermore, compared to the benchmark scheme of orthogonal frequency division multiplexing (OFDM), multi-user BS-side DAM achieves higher spectral efficiency and/or lower peak-to-average power ratio (PAPR).

* 12 pages, 7 figures

Via

Access Paper or Ask Questions

SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery

Oct 18, 2024

Enneng Yang, Li Shen, Zhenyi Wang, Guibing Guo, Xingwei Wang, Xiaocun Cao, Jie Zhang, Dacheng Tao

Abstract:Model merging-based multitask learning (MTL) offers a promising approach for performing MTL by merging multiple expert models without requiring access to raw training data. However, in this paper, we examine the merged model's representation distribution and uncover a critical issue of "representation bias". This bias arises from a significant distribution gap between the representations of the merged and expert models, leading to the suboptimal performance of the merged MTL model. To address this challenge, we first propose a representation surgery solution called Surgery. Surgery is a lightweight, task-specific module that aligns the final layer representations of the merged model with those of the expert models, effectively alleviating bias and improving the merged model's performance. Despite these improvements, a performance gap remains compared to the traditional MTL method. Further analysis reveals that representation bias phenomena exist at each layer of the merged model, and aligning representations only in the last layer is insufficient for fully reducing systemic bias because biases introduced at each layer can accumulate and interact in complex ways. To tackle this, we then propose a more comprehensive solution, deep representation surgery (also called SurgeryV2), which mitigates representation bias across all layers, and thus bridges the performance gap between model merging-based MTL and traditional MTL. Finally, we design an unsupervised optimization objective to optimize both the Surgery and SurgeryV2 modules. Our experimental results show that incorporating these modules into state-of-the-art (SOTA) model merging schemes leads to significant performance gains. Notably, our SurgeryV2 scheme reaches almost the same level as individual expert models or the traditional MTL model. The code is available at \url{https://github.com/EnnengYang/SurgeryV2}.

* This paper is an extended version of our previous work [arXiv:2402.02705] presented at ICML 2024

Via

Access Paper or Ask Questions

CoRA: Collaborative Information Perception by Large Language Model's Weights for Recommendation

Aug 20, 2024

Yuting Liu, Jinghao Zhang, Yizhou Dang, Yuliang Liang, Qiang Liu, Guibing Guo, Jianzhe Zhao, Xingwei Wang

Figure 1 for CoRA: Collaborative Information Perception by Large Language Model's Weights for Recommendation

Figure 2 for CoRA: Collaborative Information Perception by Large Language Model's Weights for Recommendation

Figure 3 for CoRA: Collaborative Information Perception by Large Language Model's Weights for Recommendation

Figure 4 for CoRA: Collaborative Information Perception by Large Language Model's Weights for Recommendation

Abstract:Involving collaborative information in Large Language Models (LLMs) is a promising technique for adapting LLMs for recommendation. Existing methods achieve this by concatenating collaborative features with text tokens into a unified sequence input and then fine-tuning to align these features with LLM's input space. Although effective, in this work, we identify two limitations when adapting LLMs to recommendation tasks, which hinder the integration of general knowledge and collaborative information, resulting in sub-optimal recommendation performance. (1) Fine-tuning LLM with recommendation data can undermine its inherent world knowledge and fundamental competencies, which are crucial for interpreting and inferring recommendation text. (2) Incorporating collaborative features into textual prompts disrupts the semantics of the original prompts, preventing LLM from generating appropriate outputs. In this paper, we propose a new paradigm, CoRA (an acronym for Collaborative LoRA), with a collaborative weights generator. Rather than input space alignment, this method aligns collaborative information with LLM's parameter space, representing them as incremental weights to update LLM's output. This way, LLM perceives collaborative information without altering its general knowledge and text inference capabilities. Specifically, we employ a collaborative filtering model to extract user and item embeddings, converting them into collaborative weights with low-rank properties through the collaborative weights generator. We then merge the collaborative weights into LLM's weights, enabling LLM to perceive the collaborative signals and generate personalized recommendations without fine-tuning or extra collaborative tokens in prompts. Extensive experiments confirm that CoRA effectively integrates collaborative information into LLM, enhancing recommendation performance.

Via

Access Paper or Ask Questions

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Aug 15, 2024

Enneng Yang, Li Shen, Guibing Guo, Xingwei Wang, Xiaochun Cao, Jie Zhang, Dacheng Tao

Figure 1 for Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Figure 2 for Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Figure 3 for Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Figure 4 for Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities

Abstract:Model merging is an efficient empowerment technique in the machine learning community that does not require the collection of raw training data and does not require expensive computation. As model merging becomes increasingly prevalent across various fields, it is crucial to understand the available model merging techniques comprehensively. However, there is a significant gap in the literature regarding a systematic and thorough review of these techniques. This survey provides a comprehensive overview of model merging methods and theories, their applications in various domains and settings, and future research directions. Specifically, we first propose a new taxonomic approach that exhaustively discusses existing model merging methods. Secondly, we discuss the application of model merging techniques in large language models, multimodal large language models, and 10+ machine learning subfields, including continual learning, multi-task learning, few-shot learning, etc. Finally, we highlight the remaining challenges of model merging and discuss future research directions. A comprehensive list of papers about model merging is available at \url{https://github.com/EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications}.

Via

Access Paper or Ask Questions

Symmetric Graph Contrastive Learning against Noisy Views for Recommendation

Aug 03, 2024

Chu Zhao, Enneng Yang, Yuliang Liang, Jianzhe Zhao, Guibing Guo, Xingwei Wang

Abstract:Graph Contrastive Learning (GCL) leverages data augmentation techniques to produce contrasting views, enhancing the accuracy of recommendation systems through learning the consistency between contrastive views. However, existing augmentation methods, such as directly perturbing interaction graph (e.g., node/edge dropout), may interfere with the original connections and generate poor contrasting views, resulting in sub-optimal performance. In this paper, we define the views that share only a small amount of information with the original graph due to poor data augmentation as noisy views (i.e., the last 20% of the views with a cosine similarity value less than 0.1 to the original view). We demonstrate through detailed experiments that noisy views will significantly degrade recommendation performance. Further, we propose a model-agnostic Symmetric Graph Contrastive Learning (SGCL) method with theoretical guarantees to address this issue. Specifically, we introduce symmetry theory into graph contrastive learning, based on which we propose a symmetric form and contrast loss resistant to noisy interference. We provide theoretical proof that our proposed SGCL method has a high tolerance to noisy views. Further demonstration is given by conducting extensive experiments on three real-world datasets. The experimental results demonstrate that our approach substantially increases recommendation accuracy, with relative improvements reaching as high as 12.25% over nine other competing models. These results highlight the efficacy of our method.

* 24 pages, submitted to TOIS

Via

Access Paper or Ask Questions