Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jizhi Zhang

Boosting Parameter Efficiency in LLM-Based Recommendation through Sophisticated Pruning

Jul 09, 2025

Shanle Zheng, Keqin Bao, Jizhi Zhang, Yang Zhang, Fuli Feng, Xiangnan He

Abstract:LLM-based recommender systems have made significant progress; however, the deployment cost associated with the large parameter volume of LLMs still hinders their real-world applications. This work explores parameter pruning to improve parameter efficiency while maintaining recommendation quality, thereby enabling easier deployment. Unlike existing approaches that focus primarily on inter-layer redundancy, we uncover intra-layer redundancy within components such as self-attention and MLP modules. Building on this analysis, we propose a more fine-grained pruning approach that integrates both intra-layer and layer-wise pruning. Specifically, we introduce a three-stage pruning strategy that progressively prunes parameters at different levels and parts of the model, moving from intra-layer to layer-wise pruning, or from width to depth. Each stage also includes a performance restoration step using distillation techniques, helping to strike a balance between performance and parameter efficiency. Empirical results demonstrate the effectiveness of our approach: across three datasets, our models achieve an average of 88% of the original model's performance while pruning more than 95% of the non-embedding parameters. This underscores the potential of our method to significantly reduce resource requirements without greatly compromising recommendation quality. Our code will be available at: https://github.com/zheng-sl/PruneRec

Via

Access Paper or Ask Questions

Leveraging Memory Retrieval to Enhance LLM-based Generative Recommendation

Dec 23, 2024

Chengbing Wang, Yang Zhang, Fengbin Zhu, Jizhi Zhang, Tianhao Shi, Fuli Feng

Figure 1 for Leveraging Memory Retrieval to Enhance LLM-based Generative Recommendation

Figure 2 for Leveraging Memory Retrieval to Enhance LLM-based Generative Recommendation

Figure 3 for Leveraging Memory Retrieval to Enhance LLM-based Generative Recommendation

Figure 4 for Leveraging Memory Retrieval to Enhance LLM-based Generative Recommendation

Abstract:Leveraging Large Language Models (LLMs) to harness user-item interaction histories for item generation has emerged as a promising paradigm in generative recommendation. However, the limited context window of LLMs often restricts them to focusing on recent user interactions only, leading to the neglect of long-term interests involved in the longer histories. To address this challenge, we propose a novel Automatic Memory-Retrieval framework (AutoMR), which is capable of storing long-term interests in the memory and extracting relevant information from it for next-item generation within LLMs. Extensive experimental results on two real-world datasets demonstrate the effectiveness of our proposed AutoMR framework in utilizing long-term interests for generative recommendation.

Via

Access Paper or Ask Questions

Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation

Oct 30, 2024

Yang Zhang, Juntao You, Yimeng Bai, Jizhi Zhang, Keqin Bao, Wenjie Wang, Tat-Seng Chua

Figure 1 for Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation

Figure 2 for Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation

Figure 3 for Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation

Figure 4 for Causality-Enhanced Behavior Sequence Modeling in LLMs for Personalized Recommendation

Abstract:Recent advancements in recommender systems have focused on leveraging Large Language Models (LLMs) to improve user preference modeling, yielding promising outcomes. However, current LLM-based approaches struggle to fully leverage user behavior sequences, resulting in suboptimal preference modeling for personalized recommendations. In this study, we propose a novel Counterfactual Fine-Tuning (CFT) method to address this issue by explicitly emphasizing the role of behavior sequences when generating recommendations. Specifically, we employ counterfactual reasoning to identify the causal effects of behavior sequences on model output and introduce a task that directly fits the ground-truth labels based on these effects, achieving the goal of explicit emphasis. Additionally, we develop a token-level weighting mechanism to adjust the emphasis strength for different item tokens, reflecting the diminishing influence of behavior sequences from earlier to later tokens during predicting an item. Extensive experiments on real-world datasets demonstrate that CFT effectively improves behavior sequence modeling. Our codes are available at https://github.com/itsmeyjt/CFT.

Via

Access Paper or Ask Questions

Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning

Oct 30, 2024

Keqin Bao, Ming Yan, Yang Zhang, Jizhi Zhang, Wenjie Wang, Fuli Feng, Xiangnan He

Figure 1 for Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning

Figure 2 for Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning

Figure 3 for Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning

Figure 4 for Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning

Abstract:Frequently updating Large Language Model (LLM)-based recommender systems to adapt to new user interests -- as done for traditional ones -- is impractical due to high training costs, even with acceleration methods. This work explores adapting to dynamic user interests without any model updates by leveraging In-Context Learning (ICL), which allows LLMs to learn new tasks from few-shot examples provided in the input. Using new-interest examples as the ICL few-shot examples, LLMs may learn real-time interest directly, avoiding the need for model updates. However, existing LLM-based recommenders often lose the in-context learning ability during recommendation tuning, while the original LLM's in-context learning lacks recommendation-specific focus. To address this, we propose RecICL, which customizes recommendation-specific in-context learning for real-time recommendations. RecICL organizes training examples in an in-context learning format, ensuring that in-context learning ability is preserved and aligned with the recommendation task during tuning. Extensive experiments demonstrate RecICL's effectiveness in delivering real-time recommendations without requiring model updates. Our code is available at https://github.com/ym689/rec_icl.

Via

Access Paper or Ask Questions

FLOW: A Feedback LOop FrameWork for Simultaneously Enhancing Recommendation and User Agents

Oct 26, 2024

Shihao Cai, Jizhi Zhang, Keqin Bao, Chongming Gao, Fuli Feng

Figure 1 for FLOW: A Feedback LOop FrameWork for Simultaneously Enhancing Recommendation and User Agents

Figure 2 for FLOW: A Feedback LOop FrameWork for Simultaneously Enhancing Recommendation and User Agents

Figure 3 for FLOW: A Feedback LOop FrameWork for Simultaneously Enhancing Recommendation and User Agents

Figure 4 for FLOW: A Feedback LOop FrameWork for Simultaneously Enhancing Recommendation and User Agents

Abstract:Agents powered by large language models have shown remarkable reasoning and execution capabilities, attracting researchers to explore their potential in the recommendation domain. Previous studies have primarily focused on enhancing the capabilities of either recommendation agents or user agents independently, but have not considered the interaction and collaboration between recommendation agents and user agents. To address this gap, we propose a novel framework named FLOW, which achieves collaboration between the recommendation agent and the user agent by introducing a feedback loop. Specifically, the recommendation agent refines its understanding of the user's preferences by analyzing the user agent's feedback on previously suggested items, while the user agent leverages suggested items to uncover deeper insights into the user's latent interests. This iterative refinement process enhances the reasoning capabilities of both the recommendation agent and the user agent, enabling more precise recommendations and a more accurate simulation of user behavior. To demonstrate the effectiveness of the feedback loop, we evaluate both recommendation performance and user simulation performance on three widely used recommendation domain datasets. The experimental results indicate that the feedback loop can simultaneously improve the performance of both the recommendation and user agents.

Via

Access Paper or Ask Questions

Debias Can be Unreliable: Mitigating Bias Issue in Evaluating Debiasing Recommendation

Sep 07, 2024

Chengbing Wang, Wentao Shi, Jizhi Zhang, Wenjie Wang, Hang Pan, Fuli Feng

Abstract:Recent work has improved recommendation models remarkably by equipping them with debiasing methods. Due to the unavailability of fully-exposed datasets, most existing approaches resort to randomly-exposed datasets as a proxy for evaluating debiased models, employing traditional evaluation scheme to represent the recommendation performance. However, in this study, we reveal that traditional evaluation scheme is not suitable for randomly-exposed datasets, leading to inconsistency between the Recall performance obtained using randomly-exposed datasets and that obtained using fully-exposed datasets. Such inconsistency indicates the potential unreliability of experiment conclusions on previous debiasing techniques and calls for unbiased Recall evaluation using randomly-exposed datasets. To bridge the gap, we propose the Unbiased Recall Evaluation (URE) scheme, which adjusts the utilization of randomly-exposed datasets to unbiasedly estimate the true Recall performance on fully-exposed datasets. We provide theoretical evidence to demonstrate the rationality of URE and perform extensive experiments on real-world datasets to validate its soundness.

* 11 pages, 5 figures

Via

Access Paper or Ask Questions

Decoding Matters: Addressing Amplification Bias and Homogeneity Issue for LLM-based Recommendation

Jun 21, 2024

Keqin Bao, Jizhi Zhang, Yang Zhang, Xinyue Huo, Chong Chen, Fuli Feng

Abstract:Adapting Large Language Models (LLMs) for recommendation requires careful consideration of the decoding process, given the inherent differences between generating items and natural language. Existing approaches often directly apply LLMs' original decoding methods. However, we find these methods encounter significant challenges: 1) amplification bias -- where standard length normalization inflates scores for items containing tokens with generation probabilities close to 1 (termed ghost tokens), and 2) homogeneity issue -- generating multiple similar or repetitive items for a user. To tackle these challenges, we introduce a new decoding approach named Debiasing-Diversifying Decoding (D3). D3 disables length normalization for ghost tokens to alleviate amplification bias, and it incorporates a text-free assistant model to encourage tokens less frequently generated by LLMs for counteracting recommendation homogeneity. Extensive experiments on real-world datasets demonstrate the method's effectiveness in enhancing accuracy and diversity.

Via

Access Paper or Ask Questions

GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation

Jun 17, 2024

Shihao Cai, Keqin Bao, Hangyu Guo, Jizhi Zhang, Jun Song, Bo Zheng

Figure 1 for GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation

Figure 2 for GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation

Figure 3 for GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation

Figure 4 for GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation

Abstract:Large language models have seen widespread adoption in math problem-solving. However, in geometry problems that usually require visual aids for better understanding, even the most advanced multi-modal models currently still face challenges in effectively using image information. High-quality data is crucial for enhancing the geometric capabilities of multi-modal models, yet existing open-source datasets and related efforts are either too challenging for direct model learning or suffer from misalignment between text and images. To overcome this issue, we introduce a novel pipeline that leverages GPT-4 and GPT-4V to generate relatively basic geometry problems with aligned text and images, facilitating model learning. We have produced a dataset of 4.9K geometry problems and combined it with 19K open-source data to form our GeoGPT4V dataset. Experimental results demonstrate that the GeoGPT4V dataset significantly improves the geometry performance of various models on the MathVista and MathVision benchmarks. The code is available at https://github.com/Lanyu0303/GeoGPT4V_Project

Via

Access Paper or Ask Questions

Learnable Tokenizer for LLM-based Generative Recommendation

May 12, 2024

Wenjie Wang, Honghui Bao, Xinyu Lin, Jizhi Zhang, Yongqi Li, Fuli Feng, See-Kiong Ng, Tat-Seng Chua

Abstract:Harnessing Large Language Models (LLMs) for generative recommendation has garnered significant attention due to LLMs' powerful capacities such as rich world knowledge and reasoning. However, a critical challenge lies in transforming recommendation data into the language space of LLMs through effective item tokenization. Existing approaches, such as ID identifiers, textual identifiers, and codebook-based identifiers, exhibit limitations in encoding semantic information, incorporating collaborative signals, or handling code assignment bias. To address these shortcomings, we propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), designed to meet the key criteria of identifiers by integrating hierarchical semantics, collaborative signals, and code assignment diversity. LETTER integrates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias. We instantiate LETTER within two generative recommender models and introduce a ranking-guided generation loss to enhance their ranking ability. Extensive experiments across three datasets demonstrate the superiority of LETTER in item tokenization, thereby advancing the state-of-the-art in the field of generative recommendation.

Via

Access Paper or Ask Questions

Fair Recommendations with Limited Sensitive Attributes: A Distributionally Robust Optimization Approach

May 02, 2024

Tianhao Shi, Yang Zhang, Jizhi Zhang, Fuli Feng, Xiangnan He

Figure 1 for Fair Recommendations with Limited Sensitive Attributes: A Distributionally Robust Optimization Approach

Figure 2 for Fair Recommendations with Limited Sensitive Attributes: A Distributionally Robust Optimization Approach

Figure 3 for Fair Recommendations with Limited Sensitive Attributes: A Distributionally Robust Optimization Approach

Figure 4 for Fair Recommendations with Limited Sensitive Attributes: A Distributionally Robust Optimization Approach

Abstract:As recommender systems are indispensable in various domains such as job searching and e-commerce, providing equitable recommendations to users with different sensitive attributes becomes an imperative requirement. Prior approaches for enhancing fairness in recommender systems presume the availability of all sensitive attributes, which can be difficult to obtain due to privacy concerns or inadequate means of capturing these attributes. In practice, the efficacy of these approaches is limited, pushing us to investigate ways of promoting fairness with limited sensitive attribute information. Toward this goal, it is important to reconstruct missing sensitive attributes. Nevertheless, reconstruction errors are inevitable due to the complexity of real-world sensitive attribute reconstruction problems and legal regulations. Thus, we pursue fair learning methods that are robust to reconstruction errors. To this end, we propose Distributionally Robust Fair Optimization (DRFO), which minimizes the worst-case unfairness over all potential probability distributions of missing sensitive attributes instead of the reconstructed one to account for the impact of the reconstruction errors. We provide theoretical and empirical evidence to demonstrate that our method can effectively ensure fairness in recommender systems when only limited sensitive attributes are accessible.

* 8 pages, 5 figures

Via

Access Paper or Ask Questions