Xiaoxiao Xu

Incorporating Group Prior into Variational Inference for Tail-User Behavior Modeling in CTR Prediction

Oct 19, 2024

Han Xu, Taoxing Pan, Zhiqiang Liu, Xiaoxiao Xu, Lantao Hu

Abstract: User behavior modeling -- which aims to extract user interests from behavioral data -- has shown great power in click-through rate (CTR) prediction, a key component in recommendation systems. Recently, attention-based algorithms have become a promising direction, as attention mechanisms emphasize the relevant interactions from rich behaviors. However, these methods struggle to capture the preferences of tail users with sparse interaction histories. To address the problem, we propose a novel variational inference approach, namely Group Prior Sampler Variational Inference (GPSVI), which introduces group preferences as priors to refine latent user interests for tail users. In GPSVI, the extent of adjustments depends on the estimated uncertainty of individual preference modeling. In addition, we enhance the expressive power of variational inference by a volume-preserving flow. An appealing property of the GPSVI method is its ability to revert to traditional attention for head users with rich behavioral data while consistently enhancing performance for long-tail users with sparse behaviors. Rigorous analysis and extensive experiments demonstrate that GPSVI consistently improves the performance of tail users. Moreover, online A/B testing on a large-scale real-world recommender system further confirms the effectiveness of our proposed approach.
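
The statement that the adjustment toward the group prior depends on estimated uncertainty can be illustrated with a conjugate Gaussian update. This is only a minimal sketch and not the GPSVI model: the function name `fuse_with_group_prior` and the scalar Gaussian setting are assumptions made for illustration. It shows how an uncertain individual estimate is pulled toward the group value while a confident one is left almost unchanged.

```python
def fuse_with_group_prior(user_mean, user_var, group_mean, group_var):
    """Precision-weighted fusion of an individual estimate with a group prior.

    Hypothetical helper: assumes scalar Gaussians and returns the
    posterior mean and variance of the fused estimate.
    """
    if user_var <= 0 or group_var <= 0:
        raise ValueError("variances must be positive")
    user_precision = 1.0 / user_var
    group_precision = 1.0 / group_var
    post_var = 1.0 / (user_precision + group_precision)
    post_mean = post_var * (user_precision * user_mean + group_precision * group_mean)
    return post_mean, post_var


# Head user: low uncertainty, so the estimate stays near the individual value.
head_mean, _ = fuse_with_group_prior(user_mean=2.0, user_var=0.01, group_mean=0.0, group_var=1.0)
# Tail user: high uncertainty, so the estimate moves toward the group value.
tail_mean, _ = fuse_with_group_prior(user_mean=2.0, user_var=100.0, group_mean=0.0, group_var=1.0)
```

In this toy setting the head user keeps an estimate close to 2.0, whereas the tail user's estimate lands close to the group mean of 0.0.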

Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective

Oct 14, 2024

Xiangru Zhu, Penglei Sun, Yaoxian Song, Yanghua Xiao, Zhixu Li, Chengyu Wang, Jun Huang, Bei Yang, Xiaoxiao Xu

Abstract: Accurate interpretation and visualization of human instructions are crucial for text-to-image (T2I) synthesis. However, current models struggle to capture semantic variations from word order changes, and existing evaluations, relying on indirect metrics like text-image similarity, fail to reliably assess these challenges. Their focus on frequent word combinations often obscures poor performance on complex or uncommon linguistic patterns. To address these deficiencies, we propose a novel metric called SemVarEffect and a benchmark named SemVarBench, designed to evaluate the causality between semantic variations in inputs and outputs in T2I synthesis. Semantic variations are achieved through two types of linguistic permutations, while avoiding easily predictable literal variations. Experiments reveal that CogView-3-Plus and Ideogram 2 performed best, achieving a score of 0.2/1. Semantic variations in object relations are less understood than attributes, scoring 0.07/1 compared to 0.17-0.19/1. We found that cross-modal alignment in UNet or Transformers plays a crucial role in handling semantic variations, a factor previously overlooked by a focus on textual encoders. Our work establishes an effective evaluation framework that advances the T2I synthesis community's exploration of human instruction understanding.

* Our benchmark and code are available at https://github.com/zhuxiangru/SemVarBench

Learning Social Graph for Inactive User Recommendation

May 08, 2024

Nian Liu, Shen Fan, Ting Bai, Peng Wang, Mingwei Sun, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Chuan Shi

Abstract: Social relations have been widely incorporated into recommender systems to alleviate the data sparsity problem. However, raw social relations don't always benefit recommendation due to their inferior quality and insufficient quantity, especially for inactive users, whose interacted items are limited. In this paper, we propose a novel social recommendation method called LSIR (Learning Social Graph for Inactive User Recommendation) that learns an optimal social graph structure for social recommendation, especially for inactive users. LSIR recursively aggregates user and item embeddings to collaboratively encode item and user features. Then, graph structure learning (GSL) is employed to refine the raw user-user social graph, by removing noisy edges and adding new edges based on the enhanced embeddings. Meanwhile, mimic learning is implemented to guide active users in mimicking inactive users during model training, which improves the construction of new edges for inactive users. Extensive experiments on real-world datasets demonstrate that LSIR achieves significant improvements of up to 129.58% on NDCG in inactive user recommendation. Our code is available at https://github.com/liun-online/LSIR.

* This paper has been received by DASFAA 2024

A Model-based Multi-Agent Personalized Short-Video Recommender System

May 03, 2024

Peilun Zhou, Xiaoxiao Xu, Lantao Hu, Han Li, Peng Jiang

Abstract: A recommender selects and presents top-K items to the user at each online request, and a recommendation session consists of several sequential requests. Formulating a recommendation session as a Markov decision process and solving it with a reinforcement learning (RL) framework has attracted increasing attention from both academic and industry communities. In this paper, we propose an RL-based industrial short-video recommender ranking framework, which models and maximizes user watch-time in an environment of user multi-aspect preferences by a collaborative multi-agent formulation. Moreover, our proposed framework adopts a model-based learning approach to alleviate the sample selection bias, which is a crucial but intractable problem in industrial recommender systems. Extensive offline evaluations and live experiments confirm the effectiveness of our proposed method over alternatives. Our proposed approach has been deployed in our real large-scale short-video sharing platform, successfully serving hundreds of millions of users.

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

Feb 28, 2024

Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi

Abstract: Large language models (LLMs) like ChatGPT, which exhibit powerful zero-shot and instruction-following capabilities, have catalyzed a revolutionary transformation across diverse fields, especially for open-ended tasks. The idea is less explored in the graph domain: despite the availability of numerous powerful graph models (GMs), they are restricted to tasks in a pre-defined form. Although several methods applying LLMs to graphs have been proposed, they use the LLM as a node feature enhancer or as a standalone predictor and fail to handle pre-defined and open-ended tasks simultaneously. To break this dilemma, we propose to bridge the pretrained GM and LLM by a Translator, named GraphTranslator, aiming to leverage the GM to handle the pre-defined tasks effectively and utilize the extended interface of LLMs to offer various open-ended tasks for the GM. To train such a Translator, we propose a Producer capable of constructing the graph-text alignment data along node information, neighbor information and model information. By translating node representations into tokens, GraphTranslator empowers an LLM to make predictions based on language instructions, providing a unified perspective for both pre-defined and open-ended tasks. Extensive results demonstrate the effectiveness of our proposed GraphTranslator on zero-shot node classification. The graph question answering experiments reveal GraphTranslator's potential across a broad spectrum of open-ended tasks through language instructions. Our code is available at: https://github.com/alibaba/GraphTranslator.

GraphGPT: Graph Learning with Generative Pre-trained Transformers

Dec 31, 2023

Qifang Zhao, Weidong Ren, Tianyu Li, Xiaoxiao Xu, Hong Liu

Abstract: We introduce GraphGPT, a novel model for graph learning by self-supervised Generative Pre-training Transformers. Our model first transforms each graph or sampled subgraph into a sequence of tokens representing the nodes, edges and attributes reversibly using the Eulerian path. Then we feed the tokens into a standard transformer decoder and pre-train it with the next-token-prediction (NTP) task. Lastly, we fine-tune the GraphGPT model with the supervised tasks. This intuitive, yet effective model achieves superior or close results to the state-of-the-art methods for the graph-, edge- and node-level tasks on the large-scale molecular dataset PCQM4Mv2, the protein-protein association dataset ogbl-ppa and the ogbn-proteins dataset from the Open Graph Benchmark (OGB). Furthermore, the generative pre-training enables us to train GraphGPT up to 400M+ parameters with consistently increasing performance, which is beyond the capability of GNNs and previous graph transformers. The source code and pre-trained checkpoints will be released soon (https://github.com/alibaba/graph-gpt) to pave the way for graph foundation model research, and also to assist scientific discovery in the pharmaceutical, chemistry, material and bio-informatics domains, etc.

* 9 pages
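
The reversible graph-to-sequence step mentioned in the GraphGPT abstract can be pictured with a small example. The sketch below is not the GraphGPT tokenizer: it assumes a connected, simple, undirected graph that already has an Eulerian path, ignores node and edge attributes, and uses the hypothetical helpers `eulerian_path` and `edges_from_sequence`. It only shows that a walk covering every edge once yields a node sequence from which the edge set can be recovered.

```python
from collections import defaultdict


def eulerian_path(edges):
    """Return a node sequence that traverses every undirected edge exactly once.

    Assumes a non-empty, connected graph with zero or two odd-degree nodes
    (Hierholzer's algorithm, iterative form).
    """
    adjacency = defaultdict(list)
    for idx, (u, v) in enumerate(edges):
        adjacency[u].append((v, idx))
        adjacency[v].append((u, idx))

    odd = [n for n in adjacency if len(adjacency[n]) % 2 == 1]
    if len(odd) not in (0, 2):
        raise ValueError("graph has no Eulerian path")
    start = odd[0] if odd else edges[0][0]

    used = [False] * len(edges)
    stack, path = [start], []
    while stack:
        node = stack[-1]
        # Discard edges that were already traversed from the other endpoint.
        while adjacency[node] and used[adjacency[node][-1][1]]:
            adjacency[node].pop()
        if adjacency[node]:
            nxt, idx = adjacency[node].pop()
            used[idx] = True
            stack.append(nxt)
        else:
            path.append(stack.pop())
    return path[::-1]


def edges_from_sequence(sequence):
    """Recover the undirected edge set from consecutive tokens."""
    return {frozenset(pair) for pair in zip(sequence, sequence[1:])}


edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
tokens = eulerian_path(edges)
```

Because each edge appears exactly once as a pair of consecutive tokens, `edges_from_sequence(tokens)` returns the original edge set, which is the sense in which such a serialization is reversible.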

Uniform Sequence Better: Time Interval Aware Data Augmentation for Sequential Recommendation

Dec 16, 2022

Yizhou Dang, Enneng Yang, Guibing Guo, Linying Jiang, Xingwei Wang, Xiaoxiao Xu, Qinghui Sun, Hong Liu

Abstract: Sequential recommendation is an important task that predicts the next item to access based on a sequence of interacted items. Most existing works learn user preference as the transition pattern from the previous item to the next one, ignoring the time interval between these two items. However, we observe that the time intervals in a sequence may vary significantly, which makes user modeling ineffective due to the issue of preference drift. In fact, we conducted an empirical study to validate this observation, and found that a sequence with uniformly distributed time intervals (denoted as a uniform sequence) is more beneficial for performance improvement than one with greatly varying time intervals. Therefore, we propose to augment sequence data from the perspective of time interval, which has not been studied in the literature. Specifically, we design five operators (Ti-Crop, Ti-Reorder, Ti-Mask, Ti-Substitute, Ti-Insert) to transform the original non-uniform sequence into a uniform sequence, taking the variance of time intervals into consideration. Then, we devise a control strategy to execute data augmentation on item sequences of different lengths. Finally, we implement these improvements on a state-of-the-art model, CoSeRec, and validate our approach on four real datasets. The experimental results show that our approach reaches significantly better performance than the other 11 competing methods. Our implementation is available: https://github.com/KingGugu/TiCoSeRec.

* 9 pages, 4 figures, AAAI-2023
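
To make the notion of a uniform sequence concrete, the sketch below picks the contiguous window whose time intervals have the smallest variance. It is one plausible reading of an interval-aware crop, not the paper's Ti-Crop operator; the function name `most_uniform_crop` and the toy data are assumptions made for illustration.

```python
from statistics import pvariance


def most_uniform_crop(items, timestamps, crop_len):
    """Return the contiguous sub-sequence whose time intervals vary the least.

    Hypothetical helper: scans every window of length crop_len and keeps the
    one with the lowest population variance of consecutive time gaps.
    """
    if len(items) != len(timestamps):
        raise ValueError("items and timestamps must have the same length")
    if not 2 <= crop_len <= len(items):
        raise ValueError("crop_len must be between 2 and the sequence length")

    best_start, best_var = 0, float("inf")
    for start in range(len(items) - crop_len + 1):
        window = timestamps[start:start + crop_len]
        intervals = [later - earlier for earlier, later in zip(window, window[1:])]
        var = pvariance(intervals)
        if var < best_var:
            best_start, best_var = start, var
    return items[best_start:best_start + crop_len]


items = ["a", "b", "c", "d", "e", "f"]
timestamps = [0, 1, 50, 60, 70, 200]
cropped = most_uniform_crop(items, timestamps, crop_len=3)  # ["c", "d", "e"]
```

The window `c, d, e` is selected because its two gaps are both 10, giving zero interval variance, while every other window mixes short and long gaps.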

Gating-adapted Wavelet Multiresolution Analysis for Exposure Sequence Modeling in CTR prediction

Apr 29, 2022

Xiaoxiao Xu, Zhiwei Fang, Qian Yu, Ruoran Huang, Chaosheng Fan, Yong Li, Yang He, Changping Peng, Zhangang Lin, Jingping Shao

Abstract: The exposure sequence is being actively studied for user interest modeling in Click-Through Rate (CTR) prediction. However, the existing methods for exposure sequence modeling bring an extensive computational burden and neglect noise problems, resulting in excessive latency and limited performance in online recommenders. In this paper, we propose to address the high latency and noise problems via Gating-adapted wavelet multiresolution analysis (Gama), which can effectively denoise the extremely long exposure sequence and adaptively capture the implied multi-dimension user interest with linear computational complexity. This is the first attempt to integrate a non-parametric multiresolution analysis technique into deep neural networks to model user exposure sequences. Extensive experiments on a large-scale benchmark dataset and a real production dataset confirm the effectiveness of Gama for exposure sequence modeling, especially in cold-start scenarios. Benefiting from its low latency and high effectiveness, Gama has been deployed in our real large-scale industrial recommender, successfully serving hundreds of millions of users.

* In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22), July 11-15, 2022, Madrid, Spain. 5 pages
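
Wavelet multiresolution analysis with linear cost, as named in the Gama abstract, can be illustrated with the Haar wavelet. The sketch below is a generic textbook decomposition, not Gama itself: it omits the gating network entirely, the choice of the Haar basis is an assumption, and `haar_decompose` and `soft_threshold` are hypothetical helper names. Each level halves the sequence, so the total work is proportional to the input length.

```python
import math


def haar_decompose(signal, levels):
    """Split a signal into a coarse approximation and per-level detail coefficients.

    Assumes len(signal) is divisible by 2 ** levels. Uses the orthonormal
    Haar transform, so the sum of squared coefficients equals that of the input.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2 ** levels")
    approx = list(signal)
    details = []
    scale = math.sqrt(2.0)
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / scale for a, b in pairs])
        approx = [(a + b) / scale for a, b in pairs]
    return approx, details


def soft_threshold(coeffs, threshold):
    """Shrink small coefficients toward zero, a common wavelet denoising step."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


# Toy exposure sequence: 1.0 marks an exposure, 0.0 marks none.
exposures = [1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0]
approx, details = haar_decompose(exposures, levels=2)
denoised_first_level = soft_threshold(details[0], threshold=0.5)
```

The coarse `approx` coefficients summarize long-range trends, while `details` hold progressively finer fluctuations that a denoising step can shrink or discard.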

Alleviating Cold-start Problem in CTR Prediction with A Variational Embedding Learning Framework

Jan 17, 2022

Xiaoxiao Xu, Chen Yang, Qian Yu, Zhiwei Fang, Jiaxing Wang, Chaosheng Fan, Yang He, Changping Peng, Zhangang Lin, Jingping Shao

Abstract: We propose a general Variational Embedding Learning Framework (VELF) for alleviating the severe cold-start problem in CTR prediction. VELF addresses the cold-start problem by alleviating the overfitting caused by data sparsity in two ways: learning probabilistic embeddings, and incorporating trainable and regularized priors which utilize the rich side information of cold-start users and advertisements (Ads). The two techniques are naturally integrated into a variational inference framework, forming an end-to-end training process. Abundant empirical tests on benchmark datasets demonstrate the advantages of our proposed VELF. In addition, extended experiments confirm that our parameterized and regularized priors provide more generalization capability than traditional fixed priors.

* In Proceedings of the Web Conference 2022 (WWW 2022), April 25-29, 2022, Lyon, France. 9 pages
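
The two ingredients named in the VELF abstract, probabilistic embeddings and a prior that regularizes them, can be sketched with diagonal Gaussians. The code below is a generic variational building block and not the VELF implementation: the helper names `sample_embedding` and `kl_diag_gaussians` are hypothetical, and how the prior parameters would be produced from side information is left out.

```python
import math
import random


def sample_embedding(mean, log_var, rng):
    """Draw an embedding via the reparameterization trick: mean + std * noise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, log_var)]


def kl_diag_gaussians(mean_q, log_var_q, mean_p, log_var_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mean_q, log_var_q, mean_p, log_var_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


rng = random.Random(0)
# Posterior parameters of one embedding (illustrative values).
post_mean, post_log_var = [0.3, -0.1], [-1.0, -1.0]
# Prior parameters; in a side-information-driven setup these would be predicted.
prior_mean, prior_log_var = [0.0, 0.0], [0.0, 0.0]

embedding = sample_embedding(post_mean, post_log_var, rng)
regularizer = kl_diag_gaussians(post_mean, post_log_var, prior_mean, prior_log_var)
```

In training, a term such as `regularizer` would typically be added to the prediction loss, so that embeddings with little supporting data stay close to the prior.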

Learning Universal User Representations via Self-Supervised Lifelong Behaviors Modeling

Oct 25, 2021

Bei Yang, Ke Liu, Xiaoxiao Xu, Renjun Xu, Hong Liu, Huan Xu

Abstract: Universal user representation is an important research topic in industry, and is widely used in diverse downstream user analysis tasks, such as user profiling and user preference prediction. With the rapid development of Internet service platforms, extremely long user behavior sequences have been accumulated. However, existing research has little ability to model universal user representation based on lifelong sequences of user behavior since registration. In this study, we propose a novel framework called Lifelong User Representation Model (LURM) to tackle this challenge. Specifically, LURM consists of two cascaded sub-models: (i) Bag of Interests (BoI) encodes user behaviors in any time period into a sparse vector with super-high dimension (e.g., 10^5); (ii) Self-supervised Multi-anchor Encoder Network (SMEN) maps sequences of BoI features to multiple low-dimensional user representations by contrastive learning. SMEN achieves almost lossless dimensionality reduction, benefiting from a novel multi-anchor module which can learn different aspects of user preferences. Experiments on several benchmark datasets show that our approach outperforms state-of-the-art unsupervised representation methods in downstream tasks.

* during peer review
