Abstract: Entity alignment is crucial for merging knowledge across knowledge graphs, as it matches entities with identical semantics. The standard method matches these entities based on their embedding similarities using semi-supervised learning. However, diverse data sources lead to non-isomorphic neighborhood structures for aligned entities, which complicates alignment, especially for less common and sparsely connected entities. This paper presents a soft label propagation framework that integrates multi-source data and iterative seed enhancement, addressing the scalability challenges of handling extensive datasets, an area where scalable computing excels. The framework uses seeds as anchors and selects optimal relationship pairs to create soft labels rich in neighborhood features and semantic relationship data. A bidirectional weighted joint loss function reduces the distance between positive samples and processes negative samples differentially, taking into account the non-isomorphic neighborhood structures. Our method outperforms existing semi-supervised approaches on multiple datasets, significantly improving the quality of entity alignment.
Abstract: Graph contrastive learning (GCL) has been widely applied to text classification tasks due to its ability to generate self-supervised signals from unlabeled data, thus facilitating model training. However, existing GCL-based text classification methods often suffer from negative sampling bias, where similar nodes are incorrectly paired as negative pairs. This can lead to over-clustering, where instances of the same class are divided into different clusters. To address the over-clustering issue, we propose ClusterText, a GCL-based method that performs graph contrastive learning via cluster-refined negative sampling for semi-supervised text classification. First, we combine the pre-trained model BERT with graph neural networks to learn text representations. Second, we introduce a clustering refinement strategy, which clusters the learned text representations to obtain pseudo labels. For each text node, the negative sample set is drawn from different clusters. Additionally, we propose a self-correction mechanism to mitigate the loss of true negative samples caused by clustering inconsistency. By calculating the Euclidean distance between each text node and the other nodes within the same cluster, distant nodes are still selected as negative samples. ClusterText also scales well, as it can effectively extract important information from a large amount of data. Experimental results demonstrate the superiority of ClusterText in text classification tasks.
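The negative selection rule described in the ClusterText abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_negatives`, the fixed `distance_threshold` used to decide which same-cluster nodes count as "distant", and the toy embeddings are all assumptions made for illustration, since the abstract does not state how the distance cutoff is chosen.

```python
import math


def select_negatives(embeddings, cluster_ids, distance_threshold):
    """Return, for each node index, the indices used as its negative samples.

    A node j is a negative for node i when either
      * j lies in a different cluster (cluster-refined sampling), or
      * j lies in the same cluster but is farther than `distance_threshold`
        in Euclidean distance (the self-correction step).
    `distance_threshold` is a hypothetical parameter, not from the abstract.
    """
    node_count = len(embeddings)
    negatives = []
    for i in range(node_count):
        chosen = []
        for j in range(node_count):
            if i == j:
                continue
            if cluster_ids[j] != cluster_ids[i]:
                chosen.append(j)
            elif math.dist(embeddings[i], embeddings[j]) > distance_threshold:
                chosen.append(j)
        negatives.append(chosen)
    return negatives


# Toy data: three nodes share pseudo label 0, but node 2 is far from nodes 0 and 1.
embeddings = [(0.0, 0.0), (0.1, 0.0), (3.0, 0.0), (5.0, 5.0)]
cluster_ids = [0, 0, 0, 1]
negatives = select_negatives(embeddings, cluster_ids, distance_threshold=1.0)
```

In this toy case nodes 0 and 1 never become negatives of each other, while node 2 is still used as a negative for both despite sharing their pseudo label.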
Abstract: Multi-modal entity alignment (MMEA) is essential for enhancing knowledge graphs and improving information retrieval and question-answering systems. Existing methods often focus on integrating modalities through their complementarity but overlook the specificity of each modality, which can obscure crucial features and reduce alignment accuracy. To address this, we propose the Multi-modal Consistency and Specificity Fusion Framework (MCSFF), which integrates both the complementary and the specific aspects of modalities. We utilize Scale Computing's hyper-converged infrastructure to optimize IT management and resource allocation in large-scale data processing. Our framework first computes a similarity matrix for each modality from its modality embeddings to preserve that modality's unique characteristics. Then, an iterative update method denoises and enhances modality features to fully express critical information. Finally, we integrate the updated information from all modalities to create enriched and precise entity representations. Experiments show that our method outperforms current state-of-the-art MMEA baselines on the MMKG dataset, demonstrating its effectiveness and practical potential.
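The first and last steps of the MCSFF abstract above (one similarity matrix per modality, then integration across modalities) can be sketched as follows. This is a sketch under stated assumptions rather than the paper's method: cosine similarity, the unweighted average in `fuse`, the modality names, and the toy embeddings are all hypothetical choices, and the iterative denoising step is omitted because the abstract does not specify it.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(source, target):
    """Rows index source entities, columns index target entities."""
    return [[cosine(s, t) for t in target] for s in source]


def fuse(matrices):
    """Assumed fusion rule: element-wise mean of the per-modality matrices."""
    count = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(m[r][c] for m in matrices) / count for c in range(cols)]
        for r in range(rows)
    ]


# Toy embeddings for two entities per graph in two hypothetical modalities.
source = {"structure": [(1.0, 0.0), (0.0, 1.0)], "image": [(1.0, 0.0), (0.0, 1.0)]}
target = {"structure": [(0.0, 2.0), (3.0, 0.0)], "image": [(0.0, 1.0), (1.0, 0.0)]}

per_modality = {name: similarity_matrix(source[name], target[name]) for name in source}
fused = fuse(list(per_modality.values()))
# Greedy alignment: each source entity picks its most similar target entity.
matches = [max(range(len(row)), key=row.__getitem__) for row in fused]
```

Keeping one matrix per modality until the final fusion is what lets each modality's own signal be inspected or reweighted before it is merged.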
Abstract: The task of multi-modal emotion recognition in conversation (MERC) aims to analyze the genuine emotional state of each utterance based on the multi-modal information in the conversation, which is crucial for conversation understanding. Existing methods focus on using graph neural networks (GNNs) to model conversational relationships and capture contextual latent semantic relationships. However, due to the complexity of GNNs, existing methods cannot efficiently capture the potential dependencies between long-distance utterances, which limits the performance of MERC. In this paper, we propose an Efficient Long-distance Latent Relation-aware Graph Neural Network (ELR-GNN) for multi-modal emotion recognition in conversations. Specifically, we first use pre-extracted text, video, and audio features as input to a Bi-LSTM to capture contextual semantic information and obtain low-level utterance features. Then, we use the low-level utterance features to construct a conversational emotion interaction graph. To efficiently capture the potential dependencies between long-distance utterances, we use the dilated generalized forward push algorithm to precompute the emotional propagation between global utterances and design an emotional relation-aware operator to capture the potential semantic associations between different utterances. Furthermore, we combine early fusion and adaptive late fusion mechanisms to fuse latent dependency information between speaker relationship information and context. Finally, we obtain high-level discourse features and feed them into an MLP for emotion prediction. Extensive experimental results show that ELR-GNN achieves state-of-the-art performance on the benchmark datasets IEMOCAP and MELD, with running times reduced by 52% and 35%, respectively.
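For readers unfamiliar with forward push, the sketch below shows the standard (non-dilated) forward push procedure for approximate personalized PageRank, which precomputes how much influence a source node propagates to every other node. It is a swapped-in baseline, not the dilated generalized variant named in the ELR-GNN abstract above, whose details the abstract does not give. The function name, the `alpha` and `epsilon` defaults, and the four-utterance chain graph are assumptions for illustration.

```python
from collections import deque


def forward_push(adjacency, source, alpha=0.15, epsilon=1e-4):
    """Approximate personalized PageRank from `source` by local pushes.

    `estimate` holds the settled propagation scores and `residual` the mass
    still waiting to be pushed. A node is pushed while its residual is at
    least `epsilon` times its degree.
    """
    estimate = {node: 0.0 for node in adjacency}
    residual = {node: 0.0 for node in adjacency}
    residual[source] = 1.0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        degree = len(adjacency[node])
        mass = residual[node]
        if degree == 0 or mass < epsilon * degree:
            continue  # stale queue entry or isolated node
        estimate[node] += alpha * mass  # settle a fraction alpha at this node
        residual[node] = 0.0
        share = (1.0 - alpha) * mass / degree  # spread the rest to neighbors
        for neighbor in adjacency[node]:
            residual[neighbor] += share
            if residual[neighbor] >= epsilon * len(adjacency[neighbor]):
                queue.append(neighbor)
    return estimate, residual


# Toy conversation graph: four utterances linked in a chain 0 - 1 - 2 - 3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
estimate, residual = forward_push(adjacency, source=0)
total_mass = sum(estimate.values()) + sum(residual.values())
```

Because each push only touches a node's neighbors, the scores can be precomputed once per source without running message passing over the whole graph at training time.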