Abstract:The problem of career trajectory prediction (CTP) aims to predict one's future employer or job position. While several CTP methods have been developed for this problem, we posit that none of these methods (1) jointly considers the mutual ternary dependency between three key units (i.e., user, position, and company) of a career and (2) captures the characteristic shifts of key units in career over time, leading to an inaccurate understanding of the job movement patterns in the labor market. To address the above challenges, we propose a novel solution, named as CAPER, that solves the challenges via sophisticated temporal knowledge graph (TKG) modeling. It enables the utilization of a graph-structured knowledge base with rich expressiveness, effectively preserving the changes in job movement patterns. Furthermore, we devise an extrapolated career reasoning task on TKG for a realistic evaluation. The experiments on a real-world career trajectory dataset demonstrate that CAPER consistently and significantly outperforms four baselines, two recent TKG reasoning methods, and five state-of-the-art CTP methods in predicting one's future companies and positions-i.e., on average, yielding 6.80% and 34.58% more accurate predictions, respectively.
Abstract:Graph Neural Networks (GNNs) have become essential tools for graph representation learning in various domains, such as social media and healthcare. However, they often suffer from fairness issues due to inherent biases in node attributes and graph structure, leading to unfair predictions. To address these challenges, we propose a novel GNN framework, DAB-GNN, that Disentangles, Amplifies, and deBiases attribute, structure, and potential biases in the GNN mechanism. DAB-GNN employs a disentanglement and amplification module that isolates and amplifies each type of bias through specialized disentanglers, followed by a debiasing module that minimizes the distance between subgroup distributions to ensure fairness. Extensive experiments on five datasets demonstrate that DAB-GNN significantly outperforms ten state-of-the-art competitors in terms of achieving an optimal balance between accuracy and fairness.
Abstract:Considerable efforts are currently underway to mitigate the negative impacts of echo chambers, such as increased susceptibility to fake news and resistance towards accepting scientific evidence. Prior research has presented the development of computer systems that support the consumption of news information from diverse political perspectives to mitigate the echo chamber effect. However, existing studies still lack the ability to effectively support the key processes of news information consumption and quantitatively identify a political stance towards the information. In this paper, we present HearHere, an AI-based web system designed to help users accommodate information and opinions from diverse perspectives. HearHere facilitates the key processes of news information consumption through two visualizations. Visualization 1 provides political news with quantitative political stance information, derived from our graph-based political classification model, and users can experience diverse perspectives (Hear). Visualization 2 allows users to express their opinions on specific political issues in a comment form and observe the position of their own opinions relative to pro-liberal and pro-conservative comments presented on a map interface (Here). Through a user study with 94 participants, we demonstrate the feasibility of HearHere in supporting the consumption of information from various perspectives. Our findings highlight the importance of providing political stance information and quantifying users' political status as a means to mitigate political polarization. In addition, we propose design implications for system development, including the consideration of demographics such as political interest and providing users with initiatives.
Abstract:The Fair Graph Anomaly Detection (FairGAD) problem aims to accurately detect anomalous nodes in an input graph while ensuring fairness and avoiding biased predictions against individuals from sensitive subgroups such as gender or political leanings. Fairness in graphs is particularly crucial in anomaly detection areas such as misinformation detection in search/ranking systems, where decision outcomes can significantly affect individuals. However, the current literature does not comprehensively discuss this problem, nor does it provide realistic datasets that encompass actual graph structures, anomaly labels, and sensitive attributes for research in FairGAD. To bridge this gap, we introduce a formal definition of the FairGAD problem and present two novel graph datasets constructed from the globally prominent social media platforms Reddit and Twitter. These datasets comprise 1.2 million and 400,000 edges associated with 9,000 and 47,000 nodes, respectively, and leverage political leanings as sensitive attributes and misinformation spreaders as anomaly labels. We demonstrate that our FairGAD datasets significantly differ from the synthetic datasets used currently by the research community. These new datasets offer significant values for FairGAD by providing realistic data that captures the intricacies of social networks. Using our datasets, we investigate the performance-fairness trade-off in eleven existing GAD and non-graph AD methods on five state-of-the-art fairness methods, which sheds light on their effectiveness and limitations in addressing the FairGAD problem.
Abstract:We address the medication recommendation problem, which aims to recommend effective medications for a patient's current visit by utilizing information (e.g., diagnoses and procedures) given at the patient's current and past visits. While there exist a number of recommender systems designed for this problem, we point out that they are challenged in accurately capturing the relation (spec., the degree of relevance) between the current and each of the past visits for the patient when obtaining her current health status, which is the basis for recommending medications. To address this limitation, we propose a novel medication recommendation framework, named VITA, based on the following two novel ideas: (1) relevant-Visit selectIon; (2) Target-aware Attention. Through extensive experiments using real-world datasets, we demonstrate the superiority of VITA (spec., up to 5.56% higher accuracy, in terms of Jaccard, than the best competitor) and the effectiveness of its two core ideas. The code is available at https://github.com/jhheo0123/VITA.
Abstract:In this paper, we focus on multimedia recommender systems using graph convolutional networks (GCNs) where the multimodal features as well as user-item interactions are employed together. Our study aims to exploit multimodal features more effectively in order to accurately capture users' preferences for items. To this end, we point out following two limitations of existing GCN-based multimedia recommender systems: (L1) although multimodal features of interacted items by a user can reveal her preferences on items, existing methods utilize GCN designed to focus only on capturing collaborative signals, resulting in insufficient reflection of the multimodal features in the final user/item embeddings; (L2) although a user decides whether to prefer the target item by considering its multimodal features, existing methods represent her as only a single embedding regardless of the target item's multimodal features and then utilize her embedding to predict her preference for the target item. To address the above issues, we propose a novel multimedia recommender system, named MONET, composed of following two core ideas: modality-embracing GCN (MeGCN) and target-aware attention. Through extensive experiments using four real-world datasets, we demonstrate i) the significant superiority of MONET over seven state-of-the-art competitors (up to 30.32% higher accuracy in terms of recall@20, compared to the best competitor) and ii) the effectiveness of the two core ideas in MONET. All MONET codes are available at https://github.com/Kimyungi/MONET.
Abstract:Personalized news recommendation aims to assist users in finding news articles that align with their interests, which plays a pivotal role in mitigating users' information overload problem. Although many recent works have been studied for better user and news representations, the following challenges have been rarely studied: (C1) How to precisely comprehend a range of intents coupled within a news article? and (C2) How to differentiate news articles with varying post-read preferences in users' click history? To tackle both challenges together, in this paper, we propose a novel personalized news recommendation framework (CIDER) that employs (1) category-guided intent disentanglement for (C1) and (2) consistency-based news representation for (C2). Furthermore, we incorporate a category prediction into the training process of CIDER as an auxiliary task, which provides supplementary supervisory signals to enhance intent disentanglement. Extensive experiments on two real-world datasets reveal that (1) CIDER provides consistent performance improvements over seven state-of-the-art news recommendation methods and (2) the proposed strategies significantly improve the model accuracy of CIDER.
Abstract:Hypergraphs can naturally model group-wise relations (e.g., a group of users who co-purchase an item) as hyperedges. Hyperedge prediction is to predict future or unobserved hyperedges, which is a fundamental task in many real-world applications (e.g., group recommendation). Despite the recent breakthrough of hyperedge prediction methods, the following challenges have been rarely studied: (C1) How to aggregate the nodes in each hyperedge candidate for accurate hyperedge prediction? and (C2) How to mitigate the inherent data sparsity problem in hyperedge prediction? To tackle both challenges together, in this paper, we propose a novel hyperedge prediction framework (CASH) that employs (1) context-aware node aggregation to precisely capture complex relations among nodes in each hyperedge for (C1) and (2) self-supervised contrastive learning in the context of hyperedge prediction to enhance hypergraph representations for (C2). Furthermore, as for (C2), we propose a hyperedge-aware augmentation method to fully exploit the latent semantics behind the original hypergraph and consider both node-level and group-level contrasts (i.e., dual contrasts) for better node and hyperedge representations. Extensive experiments on six real-world hypergraphs reveal that CASH consistently outperforms all competing methods in terms of the accuracy in hyperedge prediction and each of the proposed strategies is effective in improving the model accuracy of CASH. For the detailed information of CASH, we provide the code and datasets at: https://github.com/yy-ko/cash.
Abstract:The problem of representing nodes in a signed network as low-dimensional vectors, known as signed network embedding (SNE), has garnered considerable attention in recent years. While several SNE methods based on graph convolutional networks (GCN) have been proposed for this problem, we point out that they significantly rely on the assumption that the decades-old balance theory always holds in the real-world. To address this limitation, we propose a novel GCN-based SNE approach, named as TrustSGCN, which corrects for incorrect embedding propagation in GCN by utilizing the trustworthiness on edge signs for high-order relationships inferred by the balance theory. The proposed approach consists of three modules: (M1) generation of each node's extended ego-network; (M2) measurement of trustworthiness on edge signs; and (M3) trustworthiness-aware propagation of embeddings. Furthermore, TrustSGCN learns the node embeddings by leveraging two well-known societal theories, i.e., balance and status. The experiments on four real-world signed network datasets demonstrate that TrustSGCN consistently outperforms five state-of-the-art GCN-based SNE methods. The code is available at https://github.com/kmj0792/TrustSGCN.
Abstract:The political stance prediction for news articles has been widely studied to mitigate the echo chamber effect -- people fall into their thoughts and reinforce their pre-existing beliefs. The previous works for the political stance problem focus on (1) identifying political factors that could reflect the political stance of a news article and (2) capturing those factors effectively. Despite their empirical successes, they are not sufficiently justified in terms of how effective their identified factors are in the political stance prediction. Motivated by this, in this work, we conduct a user study to investigate important factors in political stance prediction, and observe that the context and tone of a news article (implicit) and external knowledge for real-world entities appearing in the article (explicit) are important in determining its political stance. Based on this observation, we propose a novel knowledge-aware approach to political stance prediction (KHAN), employing (1) hierarchical attention networks (HAN) to learn the relationships among words and sentences in three different levels and (2) knowledge encoding (KE) to incorporate external knowledge for real-world entities into the process of political stance prediction. Also, to take into account the subtle and important difference between opposite political stances, we build two independent political knowledge graphs (KG) (i.e., KG-lib and KG-con) by ourselves and learn to fuse the different political knowledge. Through extensive evaluations on three real-world datasets, we demonstrate the superiority of DASH in terms of (1) accuracy, (2) efficiency, and (3) effectiveness.