Abstract:Federated Learning (FL) is a distributed machine learning paradigm that enables neural network training across devices without centralizing data. While this addresses issues of information sharing and data privacy, challenges arise from data heterogeneity across clients and increasing network scale, which degrade model performance and training efficiency. Previous research shows that in IID environments, the parameter structure of the model is expected to adhere to certain consistency principles; identifying and regularizing these consistencies can therefore mitigate the issues caused by heterogeneous data. We find that both the soft labels derived from knowledge distillation and the classifier-head parameter matrix, when multiplied by their own transposes, capture the intrinsic relationships between data classes. This shared structure suggests an inherent consistency between the two. The work in this paper identifies this consistency and leverages it to regularize training, underpinning our proposed FedDW framework. Experimental results show that FedDW outperforms 10 state-of-the-art FL methods, improving accuracy by an average of 3% in highly heterogeneous settings. Additionally, we provide a theoretical proof that FedDW is more efficient, with the additional computational load from backpropagation being negligible. The code is available at https://github.com/liuvvvvv1/FedDW.
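The following is an illustrative sketch, not the authors' released implementation: it assumes the consistency described above is enforced by matching the Gram matrix of the classifier-head weights with the Gram matrix of per-class averaged soft labels; the exact distance and normalization used by FedDW are not given in the abstract, and all function names are hypothetical.

```python
import torch
import torch.nn.functional as F

def gram(matrix: torch.Tensor) -> torch.Tensor:
    """Row-normalize, then multiply by the transpose to obtain class-class relations."""
    m = F.normalize(matrix, dim=1)
    return m @ m.T  # (num_classes, num_classes)

def per_class_soft_labels(soft_labels, hard_labels, num_classes):
    """Average the distilled soft labels over samples of each ground-truth class."""
    out = torch.zeros(num_classes, soft_labels.size(1))
    for c in range(num_classes):
        mask = hard_labels == c
        if mask.any():
            out[c] = soft_labels[mask].mean(dim=0)
    return out

def consistency_regularizer(head_weight, soft_labels, hard_labels, num_classes):
    # head_weight: (num_classes, feature_dim); soft_labels: (batch, num_classes)
    r_head = gram(head_weight)
    r_soft = gram(per_class_soft_labels(soft_labels, hard_labels, num_classes))
    return F.mse_loss(r_head, r_soft.detach())
```

Such a term would be added to each client's local objective alongside the usual classification loss, which is consistent with the abstract's claim that the extra backpropagation cost is small.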
Abstract:In recent years, bundle recommendation systems have gained significant attention in both academia and industry due to their ability to enhance user experience and increase sales by recommending a set of items as a bundle rather than individual items. This survey provides a comprehensive review of bundle recommendation, beginning with a taxonomy for exploring product bundling. We classify it into two categories based on the bundling strategies used across various application domains, i.e., discriminative and generative bundle recommendation. We then formulate the corresponding tasks of the two categories and systematically review their methods: 1) representation learning at the bundle and item levels and interaction modeling for discriminative bundle recommendation; 2) representation learning at the item level and bundle generation for generative bundle recommendation. Subsequently, we survey the resources for bundle recommendation, including datasets and evaluation metrics, and conduct reproducibility experiments on mainstream models. Lastly, we discuss the main challenges and highlight promising future directions in the field of bundle recommendation, aiming to serve as a useful resource for researchers and practitioners. Our code and datasets are publicly available at https://github.com/WUT-IDEA/bundle-recommendation-survey.
Abstract:One-shot federated learning (FL) limits the communication between the server and clients to a single round, largely reducing the privacy leakage risks of traditional FL, which requires multiple rounds of communication. However, we find that existing one-shot FL frameworks are vulnerable to distributional heterogeneity because they concentrate predominantly on model heterogeneity while paying insufficient attention to data heterogeneity. Filling this gap, we propose a unified, data-free, one-shot federated learning framework (FedHydra) that can effectively address both model and data heterogeneity. Rather than applying existing value-only learning mechanisms, FedHydra adopts a structure-value learning mechanism. Specifically, a new stratified learning structure is proposed to cover data heterogeneity, while the value of each item during computation reflects model heterogeneity. With this design, data and model heterogeneity are simultaneously monitored from different aspects during learning, so FedHydra can effectively mitigate both issues by minimizing their inherent conflicts. We compare FedHydra with three SOTA baselines on four benchmark datasets. Experimental results show that our method outperforms previous one-shot FL methods in both homogeneous and heterogeneous settings.
Abstract:Aspect-based sentiment analysis (ABSA) involves identifying sentiment towards specific aspect terms in a sentence, allowing us to uncover nuanced perspectives and attitudes on particular aspects of a product, service, or topic. However, the scarcity of labeled data poses a significant challenge to training high-quality models. To address this issue, we explore the potential of data augmentation using ChatGPT, a well-performing large language model (LLM), to enhance sentiment classification performance towards aspect terms. Specifically, we explore three ChatGPT-based data augmentation strategies: context-focused, aspect-focused, and context-aspect data augmentation. Context-focused data augmentation changes the wording of the context words in a sentence while keeping the aspect terms unchanged. In contrast, aspect-focused data augmentation changes the aspect terms while keeping the context words unchanged. Context-aspect data augmentation integrates the two to generate augmented samples. Furthermore, we incorporate contrastive learning into the ABSA task to improve performance. Extensive experiments show that all three data augmentation techniques lead to performance improvements, with the context-aspect strategy performing best and surpassing the baseline models.
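A minimal sketch of how the three augmentation strategies could be phrased as prompts; the actual prompts and the LLM call used in the paper are not specified in the abstract, so the templates and example sentence below are purely hypothetical.

```python
def build_augmentation_prompt(sentence: str, aspect: str, strategy: str) -> str:
    """Return a hypothetical prompt for one of the three augmentation strategies."""
    prompts = {
        # Paraphrase the context, keep the aspect term fixed.
        "context": (f"Rewrite the sentence by paraphrasing the context words while "
                    f"keeping the aspect term '{aspect}' unchanged: {sentence}"),
        # Swap the aspect term, keep the context fixed.
        "aspect": (f"Rewrite the sentence by replacing the aspect term '{aspect}' with a "
                   f"comparable aspect term while keeping the other words unchanged: {sentence}"),
        # Combine both transformations.
        "context-aspect": (f"Rewrite the sentence by paraphrasing the context words and "
                           f"replacing the aspect term '{aspect}' with a comparable one: {sentence}"),
    }
    return prompts[strategy]

print(build_augmentation_prompt("The battery life is great but the screen is dim.",
                                "battery life", "context"))
```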
Abstract:Suicide is a prominent issue in society. Unfortunately, many people at risk do not receive the support they require, owing to barriers such as social stigma and lack of access to mental health care. With the popularity of social media, people have turned to online forums, such as Reddit, to express their feelings and seek support. This provides an opportunity to support people with the aid of artificial intelligence: social media posts can be classified, using text classification, to help connect people with professional help. However, these systems fail to account for the inherent uncertainty in classifying mental health conditions. Unlike other areas of healthcare, mental health conditions have no objective measures of disease and often rely on expert opinion. Thus, when formulating deep learning problems involving mental health, hard, binary labels do not accurately represent the true nature of the data. In these settings, where human experts may disagree, fuzzy or soft labels may be more appropriate. The current work introduces a novel label smoothing method which we use to capture the uncertainty within the data. We test our approach on a five-label multi-class classification problem. Our semi-supervised deep label smoothing method improves classification accuracy above the existing state of the art: where existing research reports an accuracy of 43% on the Reddit C-SSRS dataset, our empirical evaluation of the proposed method improves upon this benchmark to 52%. These improvements in model performance have the potential to better support those experiencing mental distress. Future work should explore the use of probabilistic methods in natural language processing and quantify the contributions of epistemic and aleatoric uncertainty in noisy datasets.
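For context only, a minimal reference sketch of standard label smoothing, the baseline idea of replacing hard labels with soft targets; the paper's semi-supervised deep label smoothing method is not described in the abstract and is not reproduced here.

```python
import torch

def smooth_labels(hard_labels: torch.Tensor, num_classes: int,
                  epsilon: float = 0.1) -> torch.Tensor:
    """Turn hard class indices into soft targets: 1 - eps on the true class,
    eps spread uniformly over the remaining classes."""
    soft = torch.full((hard_labels.size(0), num_classes),
                      epsilon / (num_classes - 1))
    soft.scatter_(1, hard_labels.unsqueeze(1), 1.0 - epsilon)
    return soft

# Example: a 5-label, C-SSRS-style classification setup
targets = torch.tensor([0, 3, 4])
print(smooth_labels(targets, num_classes=5))
```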
Abstract:Repetitive Transcranial Magnetic Stimulation (rTMS) is a well-supported, evidence-based treatment for depression. However, patterns of response to this treatment are inconsistent. Emerging evidence suggests that artificial intelligence can predict rTMS treatment outcomes for most patients using fMRI connectivity features. While these models can reliably predict treatment outcomes for many patients, for some underrepresented fMRI connectivity measures DNN models are unable to make reliable predictions. As such, we propose a novel method, the Diversity Enhancing Conditional Generative Adversarial Network (DE-CGAN), for oversampling these underrepresented examples. DE-CGAN creates synthetic examples in difficult-to-classify regions by first identifying these data points and then generating conditioned synthetic examples to enhance data diversity. Through empirical experiments we show that a classification model trained on a diversity-enhanced training set outperforms traditional data augmentation techniques and existing benchmark results. This work shows that increasing the diversity of a training dataset can improve classification model performance. Furthermore, it provides evidence for the utility of synthetic patients in providing larger, more robust datasets for both AI researchers and psychiatrists to explore variable relationships.
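A hypothetical sketch of conditional sample generation in the spirit of DE-CGAN: a generator conditioned on a class label produces synthetic fMRI-connectivity feature vectors for underrepresented, difficult-to-classify regions. The architecture, dimensions, and omitted training loop are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, noise_dim=64, num_classes=2, feature_dim=128):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, noise_dim)
        self.net = nn.Sequential(
            nn.Linear(noise_dim * 2, 256), nn.ReLU(),
            nn.Linear(256, feature_dim),
        )

    def forward(self, noise, labels):
        # Concatenate noise with the embedded condition (target response class).
        cond = self.label_embed(labels)
        return self.net(torch.cat([noise, cond], dim=1))

# After adversarial training, oversample the underrepresented class (here, label 1).
gen = ConditionalGenerator()
z = torch.randn(32, 64)
synthetic = gen(z, torch.ones(32, dtype=torch.long))  # 32 synthetic minority samples
```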
Abstract:Granularity and accuracy are two crucial factors for crime event prediction. In fine-grained event classification, multiple criminal intents may alternately appear in preceding sequential events and progress differently in the next. Such intensive intent dynamics make it hard for trained models to capture unobserved intents, leading to sub-optimal generalization performance, especially in the intertwining of numerous potential events. To capture comprehensive criminal intents, this paper proposes a fine-grained sequential crime prediction framework, CrimeAlarm, equipped with a novel mutual distillation strategy inspired by curriculum learning. During the early training phase, spot-shared criminal intents are captured through high-confidence sequence samples. In the later phase, spot-specific intents are gradually learned by increasing the contribution of low-confidence sequences. Meanwhile, the output probability distributions are reciprocally learned between prediction networks to model unobserved criminal intents. Extensive experiments show that CrimeAlarm outperforms state-of-the-art methods, with NDCG@5 improvements of 4.51% on NYC16 and 7.73% on CHI18.
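An illustrative sketch (not CrimeAlarm's actual code) of the general idea described above: two peer networks distill each other's output distributions, and a curriculum weight gradually raises the contribution of low-confidence sequences as training progresses. The specific weighting schedule below is an assumption for demonstration.

```python
import torch
import torch.nn.functional as F

def mutual_distillation_loss(logits_a, logits_b, progress: float):
    """progress in [0, 1]: fraction of training completed."""
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    # Per-sample confidence of the peer network (max predicted probability).
    conf_a = log_p_a.exp().max(dim=-1).values.detach()
    conf_b = log_p_b.exp().max(dim=-1).values.detach()
    # Early training: emphasize high-confidence sequences; later: include low-confidence ones.
    w_for_a = conf_b + progress * (1.0 - conf_b)
    w_for_b = conf_a + progress * (1.0 - conf_a)
    # Reciprocal KL terms: each network matches the other's (detached) distribution.
    kl_a = F.kl_div(log_p_a, log_p_b.exp().detach(), reduction="none").sum(-1)
    kl_b = F.kl_div(log_p_b, log_p_a.exp().detach(), reduction="none").sum(-1)
    return (w_for_a * kl_a).mean() + (w_for_b * kl_b).mean()
```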
Abstract:Meal recommendation, as a typical health-related recommendation task, involves complex relationships between users, courses, and meals, in which meal-course affiliation links user-meal and user-course interactions. However, an extensive literature review shows that there is a lack of publicly available meal recommendation datasets that include meal-course affiliation, which has constrained research from exploring how cooperation between the two levels of interaction affects personalization and healthiness. To pave the way for meal recommendation research, we introduce a new benchmark dataset called MealRec$^+$. Because constraints related to user health privacy and the characteristics of meal scenarios impede collecting data that contains both meal-course affiliation and the two levels of interaction, we adopt a simulation method that derives meal-course affiliation and user-meal interactions from dining sessions simulated from user-course interaction data. Two well-known nutritional standards are then used to calculate healthiness scores for meals. Moreover, we experiment with several baseline models, including separate and cooperative interaction learning methods. Our experiments demonstrate that cooperatively modeling the two levels of interaction in appropriate ways is beneficial for meal recommendation. Furthermore, in response to the less-healthy recommendation phenomenon observed in the experiments, we explore methods to enhance the healthiness of meal recommendations. The dataset is available on GitHub (https://github.com/WUT-IDEA/MealRecPlus).
Abstract:Happiness computing based on large-scale online web data and machine learning methods is an emerging research topic that underpins a range of issues, from personal growth to social stability. Many advanced Machine Learning (ML) models with explanations are used for online happiness assessment while maintaining high accuracy. However, domain knowledge constraints, such as the primary and secondary relations among happiness factors, are absent from these models, which limits the association between computing results and the right reasons for why they occurred. This article provides new insights into explanation consistency from an empirical study perspective, and then studies how to represent and introduce domain knowledge constraints to make ML models more trustworthy. We achieve this by: (1) proving that multiple prediction models with additive factor attributions have the desirable property of primary and secondary relation consistency, and (2) showing that quantified factor relations can be represented as an importance distribution for encoding domain knowledge. Differences in factor explanations are penalized by a Kullback-Leibler divergence-based loss among computing models. Experimental results on two online web datasets show that domain knowledge of stable factor relations exists; using this knowledge not only improves happiness computing accuracy but also reveals more significant happiness factors that better assist decision-making.
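A minimal sketch, assuming the domain-knowledge constraint is expressed as a target importance distribution over happiness factors and a model's additive factor attributions are normalized into a comparable distribution before the KL-based penalty is applied; factor dimensions, the attribution source (e.g., SHAP-style values), and the direction of the divergence are illustrative assumptions, since the abstract does not fix them.

```python
import torch
import torch.nn.functional as F

def explanation_kl_loss(factor_attributions: torch.Tensor,
                        prior_importance: torch.Tensor) -> torch.Tensor:
    """factor_attributions: (batch, num_factors) additive attributions per sample.
    prior_importance: (num_factors,) importance distribution encoding domain knowledge."""
    # Normalize absolute attributions into a per-sample distribution over factors.
    attr = factor_attributions.abs()
    model_dist = attr / attr.sum(dim=1, keepdim=True).clamp_min(1e-8)
    prior = prior_importance / prior_importance.sum()
    # Penalize divergence between the model's explanation and the prior importance.
    return F.kl_div(model_dist.clamp_min(1e-8).log(),
                    prior.expand_as(model_dist), reduction="batchmean")
```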
Abstract:The exploration of Graph Neural Networks (GNNs) for processing graph-structured data has expanded, particularly their potential for causal analysis due to their universal approximation capabilities. Although causally enhanced GNNs are anticipated to significantly improve common graph-based tasks such as classification and prediction, the development of such a framework has yet to be thoroughly investigated. Addressing this shortfall, our study examines nine benchmark graph classification models, testing their strength and versatility across seven datasets spanning three varied domains to discern the impact of causality on the predictive power of GNNs. This research offers a detailed assessment of these models, shedding light on their efficiency and flexibility in different data environments and highlighting areas needing advancement. Our findings are instrumental in furthering the understanding and practical application of GNNs in diverse data-centric fields.