Weixin Chen

Leave No One Behind: Fairness-Aware Cross-Domain Recommender Systems for Non-Overlapping Users

Jul 23, 2025

Weixin Chen, Yuhan Zhao, Li Chen, Weike Pan

Abstract: Cross-domain recommendation (CDR) methods predominantly leverage overlapping users to transfer knowledge from a source domain to a target domain. However, through empirical studies, we uncover a critical bias inherent in these approaches: while overlapping users experience significant enhancements in recommendation quality, non-overlapping users benefit minimally and even face performance degradation. This unfairness may erode user trust, and, consequently, negatively impact business engagement and revenue. To address this issue, we propose a novel solution that generates virtual source-domain users for non-overlapping target-domain users. Our method utilizes a dual attention mechanism to discern similarities between overlapping and non-overlapping users, thereby synthesizing realistic virtual user embeddings. We further introduce a limiter component that ensures the generated virtual users align with real-data distributions while preserving each user's unique characteristics. Notably, our method is model-agnostic and can be seamlessly integrated into any CDR model. Comprehensive experiments conducted on three public datasets with five CDR baselines demonstrate that our method effectively mitigates the CDR non-overlapping user bias, without loss of overall accuracy. Our code is publicly available at https://github.com/WeixinChen98/VUG.
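The idea of synthesizing a virtual source-domain embedding from similar overlapping users can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single softmax attention over dot-product similarities rather than the dual attention mechanism, it omits the limiter component, and the function and argument names are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def virtual_source_embedding(target_emb, overlap_target_embs, overlap_source_embs):
    """Attention-weighted average of overlapping users' source-domain embeddings.

    Assumption (not from the paper): similarity is a plain dot product between
    the non-overlapping user's target-domain embedding and each overlapping
    user's target-domain embedding.
    """
    weights = softmax([dot(target_emb, t) for t in overlap_target_embs])
    dim = len(overlap_source_embs[0])
    return [
        sum(w * s[d] for w, s in zip(weights, overlap_source_embs))
        for d in range(dim)
    ]

# A non-overlapping user who resembles the first overlapping user more closely
# receives a virtual embedding pulled toward that user's source embedding.
virtual = virtual_source_embedding(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
)
```

Because the output is a convex combination of real source-domain embeddings, it stays inside the region spanned by observed users, which is one simple way to keep generated users close to the real-data distribution.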

* Accepted by RecSys 2025

Neural Probabilistic Circuits: Enabling Compositional and Interpretable Predictions through Logical Reasoning

Jan 13, 2025

Weixin Chen, Simon Yu, Huajie Shao, Lui Sha, Han Zhao

Abstract: End-to-end deep neural networks have achieved remarkable success across various domains but are often criticized for their lack of interpretability. While post hoc explanation methods attempt to address this issue, they often fail to accurately represent these black-box models, resulting in misleading or incomplete explanations. To overcome these challenges, we propose an inherently transparent model architecture called Neural Probabilistic Circuits (NPCs), which enable compositional and interpretable predictions through logical reasoning. In particular, an NPC consists of two modules: an attribute recognition model, which predicts probabilities for various attributes, and a task predictor built on a probabilistic circuit, which enables logical reasoning over recognized attributes to make class predictions. To train NPCs, we introduce a three-stage training algorithm comprising attribute recognition, circuit construction, and joint optimization. Moreover, we theoretically demonstrate that an NPC's error is upper-bounded by a linear combination of the errors from its modules. To further demonstrate the interpretability of NPCs, we provide both the most probable explanations and the counterfactual explanations. Empirical results on four benchmark datasets show that NPCs strike a balance between interpretability and performance, achieving results competitive even with those of end-to-end black-box models while providing enhanced interpretability.

Political-LLM: Large Language Models in Political Science

Dec 09, 2024

Lincan Li, Jiaqi Li, Catherine Chen, Fred Gui, Hongjia Yang, Chenxiao Yu, Zhengguang Wang, Jianing Cai, Junlong Aaron Zhou, Bolin Shen(+37 more)

Abstract: In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematically understand how LLMs can further revolutionize the field has also become urgent. In this work, we--a multidisciplinary team of researchers spanning computer science and political science--present the first principled framework termed Political-LLM to advance the comprehensive understanding of integrating LLMs into computational political science. Specifically, we first introduce a fundamental taxonomy classifying the existing explorations into two perspectives: political science and computational methodologies. In particular, from the political science perspective, we highlight the role of LLMs in automating predictive and generative tasks, simulating behavior dynamics, and improving causal inference through tools like counterfactual generation; from a computational perspective, we introduce advancements in data preparation, fine-tuning, and evaluation methods for LLMs that are tailored to political contexts. We identify key challenges and future directions, emphasizing the development of domain-specific datasets, addressing issues of bias and fairness, incorporating human expertise, and redefining evaluation criteria to align with the unique requirements of computational political science. Political-LLM seeks to serve as a guidebook for researchers to foster an informed, ethical, and impactful use of Artificial Intelligence in political science. Our online resource is available at: http://political-llm.org/.

* 54 Pages, 9 Figures

Matryoshka Representation Learning for Recommendation

Jun 11, 2024

Riwei Lai, Li Chen, Weixin Chen, Rui Chen

Abstract: Representation learning is essential for deep-neural-network-based recommender systems to capture user preferences and item features within fixed-dimensional user and item vectors. Unlike existing representation learning methods that either treat each user preference and item feature uniformly or categorize them into discrete clusters, we argue that in the real world, user preferences and item features are naturally expressed and organized in a hierarchical manner, leading to a new direction for representation learning. In this paper, we introduce a novel matryoshka representation learning method for recommendation (MRL4Rec), by which we restructure user and item vectors into matryoshka representations with incrementally dimensional and overlapping vector spaces to explicitly represent user preferences and item features at different hierarchical levels. We theoretically establish that constructing training triplets specific to each level is pivotal in guaranteeing accurate matryoshka representation learning. Subsequently, we propose the matryoshka negative sampling mechanism to construct training triplets, which further ensures the effectiveness of the matryoshka representation learning in capturing hierarchical user preferences and item features. The experiments demonstrate that MRL4Rec can consistently and substantially outperform a number of state-of-the-art competitors on several real-life datasets. Our code is publicly available at https://github.com/Riwei-HEU/MRL.
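One way to read "incrementally dimensional and overlapping vector spaces" is as nested prefixes of a single embedding, where each hierarchical level uses the first few dimensions. The sketch below scores a user-item pair at every level under that reading. The prefix interpretation, the function name, and the level sizes are assumptions for illustration and are not taken from the MRL4Rec code.

```python
def matryoshka_scores(user_vec, item_vec, level_dims):
    """Return one dot-product score per hierarchical level.

    Level k uses only the first level_dims[k] dimensions, so every smaller
    representation is contained in the larger ones (hypothetical layout).
    """
    scores = []
    for d in level_dims:
        scores.append(sum(u * v for u, v in zip(user_vec[:d], item_vec[:d])))
    return scores

# Two levels: a coarse 2-dimensional view and the full 4-dimensional view.
scores = matryoshka_scores([1, 2, 3, 4], [1, 1, 1, 1], level_dims=(2, 4))
```

With per-level scores available, a training loss can be computed separately for each level, which is where level-specific training triplets would enter.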

GRATH: Gradual Self-Truthifying for Large Language Models

Jan 31, 2024

Weixin Chen, Dawn Song, Bo Li

Abstract: Truthfulness is paramount for large language models (LLMs) as they are increasingly deployed in real-world applications. However, existing LLMs still struggle with generating truthful content, as evidenced by their modest performance on benchmarks like TruthfulQA. To address this issue, we propose GRAdual self-truTHifying (GRATH), a novel post-processing method to enhance the truthfulness of LLMs. GRATH utilizes out-of-domain question prompts to generate pairwise truthfulness training data with each pair containing a question and its correct and incorrect answers, and then optimizes the model via direct preference optimization (DPO) to learn from the truthfulness difference between answer pairs. GRATH iteratively refines truthfulness data and updates the model, leading to a gradual improvement in model truthfulness in a self-supervised manner. Empirically, we evaluate GRATH using different 7B-LLMs and compare with LLMs with similar or even larger sizes on benchmark datasets. Our results show that GRATH effectively improves LLMs' truthfulness without compromising other core capabilities. Notably, GRATH achieves state-of-the-art performance on TruthfulQA, with MC1 accuracy of 54.71% and MC2 accuracy of 69.10%, which even surpass those of 70B-LLMs.
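For readers unfamiliar with direct preference optimization, the per-pair objective can be sketched as follows. This shows the standard DPO loss for one (correct, incorrect) answer pair given sequence log-probabilities under the trained policy and a frozen reference model. The function name, argument names, and the default `beta` are illustrative assumptions, not values from the GRATH paper.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    "chosen" is the correct answer and "rejected" the incorrect one.
    Inputs are summed token log-probabilities of each answer.
    """
    # How much more the policy prefers the chosen answer than the reference does.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))

# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-1.0, -1.0, -1.0, -1.0)
# Raising the chosen answer's likelihood relative to the rejected one lowers the loss.
improved = dpo_loss(-0.5, -2.0, -1.0, -1.0)
```

In an iterative scheme like the one described, the answer pairs would be regenerated with the updated model between rounds and this loss minimized again on the refreshed pairs.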

FMMRec: Fairness-aware Multimodal Recommendation

Oct 26, 2023

Weixin Chen, Li Chen, Yongxin Ni, Yuhan Zhao, Fajie Yuan, Yongfeng Zhang

Abstract: Recently, multimodal recommendations have gained increasing attention for effectively addressing the data sparsity problem by incorporating modality-based representations. Although multimodal recommendations excel in accuracy, the introduction of different modalities (e.g., images, text, and audio) may expose more of users' sensitive information (e.g., gender and age) to recommender systems, resulting in potentially more serious unfairness issues. Despite many efforts on fairness, existing fairness-aware methods are either incompatible with multimodal scenarios, or lead to suboptimal fairness performance due to neglecting sensitive information of multimodal content. To achieve counterfactual fairness in multimodal recommendations, we propose a novel fairness-aware multimodal recommendation approach (dubbed FMMRec) to disentangle the sensitive and non-sensitive information from modal representations and leverage the disentangled modal representations to guide fairer representation learning. Specifically, we first disentangle biased and filtered modal representations by maximizing and minimizing their sensitive attribute prediction ability, respectively. With the disentangled modal representations, we mine the modality-based unfair and fair (corresponding to biased and filtered) user-user structures for enhancing explicit user representation with the biased and filtered neighbors from the corresponding structures, followed by adversarially filtering out sensitive information. Experiments on two real-world public datasets demonstrate the superiority of our FMMRec relative to the state-of-the-art baselines. Our source code is available at https://anonymous.4open.science/r/FMMRec.

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Jun 20, 2023

Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer(+9 more)

Abstract: Generative Pre-trained Transformer (GPT) models have exhibited exciting progress in capabilities, capturing the interest of practitioners and the public alike. Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications such as healthcare and finance - where mistakes can be costly. To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives - including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. Based on our evaluations, we discover previously unpublished vulnerabilities to trustworthiness threats. For instance, we find that GPT models can be easily misled to generate toxic and biased outputs and leak private information in both training data and conversation history. We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, potentially because GPT-4 follows the (misleading) instructions more precisely. Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps. Our benchmark is publicly available at https://decodingtrust.github.io/.

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

Mar 10, 2023

Weixin Chen, Dawn Song, Bo Li

Abstract: Diffusion models have achieved great success in a range of tasks, such as image synthesis and molecule design. As such successes hinge on large-scale training data collected from diverse sources, the trustworthiness of these collected data is hard to control or audit. In this work, we aim to explore the vulnerabilities of diffusion models under potential training data manipulations and try to answer: How hard is it to perform Trojan attacks on well-trained diffusion models? What are the adversarial targets that such Trojan attacks can achieve? To answer these questions, we propose an effective Trojan attack against diffusion models, TrojDiff, which optimizes the Trojan diffusion and generative processes during training. In particular, we design novel transitions during the Trojan diffusion process to diffuse adversarial targets into a biased Gaussian distribution and propose a new parameterization of the Trojan generative process that leads to an effective training objective for the attack. In addition, we consider three types of adversarial targets: the Trojaned diffusion models will always output instances belonging to a certain class from the in-domain distribution (In-D2D attack), out-of-domain distribution (Out-D2D attack), and one specific instance (D2I attack). We evaluate TrojDiff on CIFAR-10 and CelebA datasets against both DDPM and DDIM diffusion models. We show that TrojDiff always achieves high attack performance under different adversarial targets using different types of triggers, while the performance in benign environments is preserved. The code is available at https://github.com/chenweixin107/TrojDiff.

* CVPR 2023

An Analytical Range-Angle Dependent Beam Focusing Model for Terahertz Linear Antenna Array

May 22, 2022

Lingxiang Li, Haoran Li, Zhi Chen, Weixin Chen, Shaoqian Li

Abstract: This paper considers a scenario in which the Terahertz (THz) transmitter equipped with a linear antenna array wishes to focus its beam to a desired spatial region in the array near-field. The goal is to compute the achievable spatial region and determine how the system parameters such as the carrier frequency, the array dimension and the user's location affect its beam focusing performance. First, based on a theorem from analytic geometry, we show that the achievable focusing spatial region constitutes a rotated ellipse, with the x and y coordinates denoting the range and angle, respectively. In this way, the determination of the spatial region is reduced to a problem of deriving the coverage of an ellipse. The achievable coverage is then obtained in closed form, and the construction of carrier frequency offsets that can analytically control the beam focusing performance is provided. Numerical results validate the theoretical findings and demonstrate the performance of the proposed method.

* 14 pages, 3 figures

Enabling the Network to Surf the Internet

Feb 24, 2021

Zhuoling Li, Haohan Wang, Tymoteusz Swistek, Weixin Chen, Yuanzheng Li, Haoqian Wang

Abstract: Few-shot learning is challenging due to the limited data and labels. Existing algorithms usually resolve this problem by pre-training the model with a considerable amount of annotated data which shares knowledge with the target domain. Nevertheless, large quantities of homogeneous data samples are not always available. To tackle this issue, we develop a framework that enables the model to surf the Internet, which implies that the model can collect and annotate data without manual effort. Since the online data is virtually limitless and continues to be generated, the model can thus be empowered to constantly obtain up-to-date knowledge from the Internet. Additionally, we observe that the generalization ability of the learned representation is crucial for self-supervised learning. To present its importance, a naive yet efficient normalization strategy is proposed. Consequently, this strategy boosts the accuracy of the model significantly (20.46% at most). We demonstrate the superiority of the proposed framework with experiments on miniImageNet, tieredImageNet and Omniglot. The results indicate that our method has surpassed previous unsupervised counterparts by a large margin (more than 10%) and obtained performance comparable with the supervised ones.
