Abstract:This paper introduces a novel framework for unified incremental few-shot object detection (iFSOD) and instance segmentation (iFSIS) using the Transformer architecture. Our goal is to create an optimal solution for situations where only a few examples of novel object classes are available, with no access to training data for base or old classes, while maintaining high performance across both base and novel classes. To achieve this, We extend Mask-DINO into a two-stage incremental learning framework. Stage 1 focuses on optimizing the model using the base dataset, while Stage 2 involves fine-tuning the model on novel classes. Besides, we incorporate a classifier selection strategy that assigns appropriate classifiers to the encoder and decoder according to their distinct functions. Empirical evidence indicates that this approach effectively mitigates the over-fitting on novel classes learning. Furthermore, we implement knowledge distillation to prevent catastrophic forgetting of base classes. Comprehensive evaluations on the COCO and LVIS datasets for both iFSIS and iFSOD tasks demonstrate that our method significantly outperforms state-of-the-art approaches.
Abstract:In recommender systems, various latent confounding factors (e.g., user social environment and item public attractiveness) can affect user behavior, item exposure, and feedback in distinct ways. These factors may directly or indirectly impact user feedback and are often shared across items or users, making them multi-cause latent confounders. However, existing methods typically fail to account for latent confounders between users and their feedback, as well as those between items and user feedback simultaneously. To address the problem of multi-cause latent confounders, we propose a multi-cause deconfounding method for recommender systems with latent confounders (MCDCF). MCDCF leverages multi-cause causal effect estimation to learn substitutes for latent confounders associated with both users and items, using user behaviour data. Specifically, MCDCF treats the multiple items that users interact with and the multiple users that interact with items as treatment variables, enabling it to learn substitutes for the latent confounders that influence the estimation of causality between users and their feedback, as well as between items and user feedback. Additionally, we theoretically demonstrate the soundness of our MCDCF method. Extensive experiments on three real-world datasets demonstrate that our MCDCF method effectively recovers latent confounders related to users and items, reducing bias and thereby improving recommendation accuracy.
Abstract:With the widespread use of Graph Neural Networks (GNNs) for representation learning from network data, the fairness of GNN models has raised great attention lately. Fair GNNs aim to ensure that node representations can be accurately classified, but not easily associated with a specific group. Existing advanced approaches essentially enhance the generalisation of node representation in combination with data augmentation strategy, and do not directly impose constraints on the fairness of GNNs. In this work, we identify that a fundamental reason for the unfairness of GNNs in social network learning is the phenomenon of social homophily, i.e., users in the same group are more inclined to congregate. The message-passing mechanism of GNNs can cause users in the same group to have similar representations due to social homophily, leading model predictions to establish spurious correlations with sensitive attributes. Inspired by this reason, we propose a method called Equity-Aware GNN (EAGNN) towards fair graph representation learning. Specifically, to ensure that model predictions are independent of sensitive attributes while maintaining prediction performance, we introduce constraints for fair representation learning based on three principles: sufficiency, independence, and separation. We theoretically demonstrate that our EAGNN method can effectively achieve group fairness. Extensive experiments on three datasets with varying levels of social homophily illustrate that our EAGNN method achieves the state-of-the-art performance across two fairness metrics and offers competitive effectiveness.
Abstract:Quantum federated learning has brought about the improvement of privacy image classification, while the lack of personality of the client model may contribute to the suboptimal of quantum federated learning. A personalized quantum federated learning algorithm for privacy image classification is proposed to enhance the personality of the client model in the case of an imbalanced distribution of images. First, a personalized quantum federated learning model is constructed, in which a personalized layer is set for the client model to maintain the personalized parameters. Second, a personalized quantum federated learning algorithm is introduced to secure the information exchanged between the client and server.Third, the personalized federated learning is applied to image classification on the FashionMNIST dataset, and the experimental results indicate that the personalized quantum federated learning algorithm can obtain global and local models with excellent performance, even in situations where local training samples are imbalanced. The server's accuracy is 100% with 8 clients and a distribution parameter of 100, outperforming the non-personalized model by 7%. The average client accuracy is 2.9% higher than that of the non-personalized model with 2 clients and a distribution parameter of 1. Compared to previous quantum federated learning algorithms, the proposed personalized quantum federated learning algorithm eliminates the need for additional local training while safeguarding both model and data privacy.It may facilitate broader adoption and application of quantum technologies, and pave the way for more secure, scalable, and efficient quantum distribute machine learning solutions.
Abstract:The rapid development of Large Language Models (LLMs) creates new opportunities for recommender systems, especially by exploiting the side information (e.g., descriptions and analyses of items) generated by these models. However, aligning this side information with collaborative information from historical interactions poses significant challenges. The inherent biases within LLMs can skew recommendations, resulting in distorted and potentially unfair user experiences. On the other hand, propensity bias causes side information to be aligned in such a way that it often tends to represent all inputs in a low-dimensional subspace, leading to a phenomenon known as dimensional collapse, which severely restricts the recommender system's ability to capture user preferences and behaviours. To address these issues, we introduce a novel framework named Counterfactual LLM Recommendation (CLLMR). Specifically, we propose a spectrum-based side information encoder that implicitly embeds structural information from historical interactions into the side information representation, thereby circumventing the risk of dimension collapse. Furthermore, our CLLMR approach explores the causal relationships inherent in LLM-based recommender systems. By leveraging counterfactual inference, we counteract the biases introduced by LLMs. Extensive experiments demonstrate that our CLLMR approach consistently enhances the performance of various recommender models.
Abstract:In recommender systems, popularity and conformity biases undermine recommender effectiveness by disproportionately favouring popular items, leading to their over-representation in recommendation lists and causing an unbalanced distribution of user-item historical data. We construct a causal graph to address both biases and describe the abstract data generation mechanism. Then, we use it as a guide to develop a novel Debiased Contrastive Learning framework for Mitigating Dual Biases, called DCLMDB. In DCLMDB, both popularity bias and conformity bias are handled in the model training process by contrastive learning to ensure that user choices and recommended items are not unduly influenced by conformity and popularity. Extensive experiments on two real-world datasets, Movielens-10M and Netflix, show that DCLMDB can effectively reduce the dual biases, as well as significantly enhance the accuracy and diversity of recommendations.
Abstract:Graph unlearning technology has become increasingly important since the advent of the `right to be forgotten' and the growing concerns about the privacy and security of artificial intelligence. Graph unlearning aims to quickly eliminate the effects of specific data on graph neural networks (GNNs). However, most existing deterministic graph unlearning frameworks follow a balanced partition-submodel training-aggregation paradigm, resulting in a lack of structural information between subgraph neighborhoods and redundant unlearning parameter calculations. To address this issue, we propose a novel Graph Structure Mapping Unlearning paradigm (GSMU) and a novel method based on it named Community-centric Graph Eraser (CGE). CGE maps community subgraphs to nodes, thereby enabling the reconstruction of a node-level unlearning operation within a reduced mapped graph. CGE makes the exponential reduction of both the amount of training data and the number of unlearning parameters. Extensive experiments conducted on five real-world datasets and three widely used GNN backbones have verified the high performance and efficiency of our CGE method, highlighting its potential in the field of graph unlearning.
Abstract:In recommender systems, latent variables can cause user-item interaction data to deviate from true user preferences. This biased data is then used to train recommendation models, further amplifying the bias and ultimately compromising both recommendation accuracy and user satisfaction. Instrumental Variable (IV) methods are effective tools for addressing the confounding bias introduced by latent variables; however, identifying a valid IV is often challenging. To overcome this issue, we propose a novel data-driven conditional IV (CIV) debiasing method for recommender systems, called CIV4Rec. CIV4Rec automatically generates valid CIVs and their corresponding conditioning sets directly from interaction data, significantly reducing the complexity of IV selection while effectively mitigating the confounding bias caused by latent variables in recommender systems. Specifically, CIV4Rec leverages a variational autoencoder (VAE) to generate the representations of the CIV and its conditional set from interaction data, followed by the application of least squares to derive causal representations for click prediction. Extensive experiments on two real-world datasets, Movielens-10M and Douban-Movie, demonstrate that our CIV4Rec successfully identifies valid CIVs, effectively reduces bias, and consequently improves recommendation accuracy.
Abstract:Popular methods usually use a degradation model in a supervised way to learn a watermark removal model. However, it is true that reference images are difficult to obtain in the real world, as well as collected images by cameras suffer from noise. To overcome these drawbacks, we propose a perceptive self-supervised learning network for noisy image watermark removal (PSLNet) in this paper. PSLNet depends on a parallel network to remove noise and watermarks. The upper network uses task decomposition ideas to remove noise and watermarks in sequence. The lower network utilizes the degradation model idea to simultaneously remove noise and watermarks. Specifically, mentioned paired watermark images are obtained in a self supervised way, and paired noisy images (i.e., noisy and reference images) are obtained in a supervised way. To enhance the clarity of obtained images, interacting two sub-networks and fusing obtained clean images are used to improve the effects of image watermark removal in terms of structural information and pixel enhancement. Taking into texture information account, a mixed loss uses obtained images and features to achieve a robust model of noisy image watermark removal. Comprehensive experiments show that our proposed method is very effective in comparison with popular convolutional neural networks (CNNs) for noisy image watermark removal. Codes can be obtained at https://github.com/hellloxiaotian/PSLNet.
Abstract:Deep convolutional neural networks (CNNs) depend on feedforward and feedback ways to obtain good performance in image denoising. However, how to obtain effective structural information via CNNs to efficiently represent given noisy images is key for complex scenes. In this paper, we propose a cross Transformer denoising CNN (CTNet) with a serial block (SB), a parallel block (PB), and a residual block (RB) to obtain clean images for complex scenes. A SB uses an enhanced residual architecture to deeply search structural information for image denoising. To avoid loss of key information, PB uses three heterogeneous networks to implement multiple interactions of multi-level features to broadly search for extra information for improving the adaptability of an obtained denoiser for complex scenes. Also, to improve denoising performance, Transformer mechanisms are embedded into the SB and PB to extract complementary salient features for effectively removing noise in terms of pixel relations. Finally, a RB is applied to acquire clean images. Experiments illustrate that our CTNet is superior to some popular denoising methods in terms of real and synthetic image denoising. It is suitable to mobile digital devices, i.e., phones. Codes can be obtained at https://github.com/hellloxiaotian/CTNet.