Abstract: Efficient data selection is crucial for accelerating the pretraining of large language models (LLMs). While various methods have been proposed to enhance data efficiency, little research has addressed the inherent conflicts between these approaches when they are combined to achieve optimal data selection for LLM pretraining. To tackle this problem, we propose a novel multi-agent collaborative data selection mechanism. In this framework, each data selection method serves as an independent agent, and an agent console is designed to dynamically integrate the information from all agents throughout the LLM training process. We conduct extensive empirical studies to evaluate our multi-agent framework. The experimental results demonstrate that our approach significantly improves data efficiency, accelerates convergence in LLM training, and achieves an average performance gain of 10.5% across multiple language model benchmarks compared to state-of-the-art methods.
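The abstract does not specify how the agent console combines agent outputs, so the following is only a minimal sketch under stated assumptions: each agent is a function that maps a document to a score in [0, 1], the console takes a weighted average of those scores, and the weights are nudged toward a per-agent reward signal during training. The two toy agents, the update rule, and all names (`console_scores`, `update_weights`, `select_top_k`) are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical agent type: maps a document to a score in [0, 1].
Agent = Callable[[str], float]


def quality_agent(doc: str) -> float:
    """Toy quality proxy: fraction of alphabetic characters."""
    return sum(c.isalpha() for c in doc) / max(len(doc), 1)


def diversity_agent(doc: str) -> float:
    """Toy diversity proxy: ratio of unique words to total words."""
    words = doc.split()
    return len(set(words)) / max(len(words), 1)


def console_scores(docs: List[str], agents: Dict[str, Agent],
                   weights: Dict[str, float]) -> List[float]:
    """Combine agent scores into one score per document (weighted average)."""
    total = sum(weights.values())
    return [sum(weights[name] * agent(doc) for name, agent in agents.items()) / total
            for doc in docs]


def update_weights(weights: Dict[str, float], rewards: Dict[str, float],
                   lr: float = 0.5) -> Dict[str, float]:
    """Assumed update rule: move each agent's weight toward its observed reward."""
    return {name: (1 - lr) * w + lr * rewards[name] for name, w in weights.items()}


def select_top_k(docs: List[str], scores: List[float], k: int) -> List[str]:
    """Keep the k highest-scoring documents."""
    ranked = sorted(zip(scores, docs), key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:k]]


if __name__ == "__main__":
    docs = ["the the the the",
            "graph neural networks learn representations",
            "!!! ###"]
    agents = {"quality": quality_agent, "diversity": diversity_agent}
    weights = {"quality": 1.0, "diversity": 1.0}
    print(select_top_k(docs, console_scores(docs, agents, weights), k=2))
```

In this sketch the rewards passed to `update_weights` would come from whatever feedback the training loop exposes (for example, validation loss after training on data favored by each agent); that choice is also an assumption.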
Abstract: Heterogeneous Graph Neural Networks (HGNNs) are a class of deep learning models designed specifically for heterogeneous graphs, which are graphs that contain different types of nodes and edges. This paper investigates the application of curriculum learning techniques to improve the performance and robustness of HGNNs. To better classify the quality of the data, we design a loss-aware training schedule, named LTS, that measures the quality of every node in the data and incorporates the training dataset into the model progressively, increasing difficulty step by step. LTS can be seamlessly integrated into various frameworks, effectively reducing bias and variance, mitigating the impact of noisy data, and enhancing overall accuracy. Our findings demonstrate the efficacy of curriculum learning in enhancing HGNN capabilities for analyzing complex graph-structured data. The code is public at https://github.com/LARS-research/CLGNN/.
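A loss-aware curriculum of the kind described above can be sketched as follows. This is an illustration under assumptions, not the LTS implementation: it treats each node's current training loss as its difficulty, uses a linear pacing function, and exposes the easiest nodes first. The function names, the `start` fraction, and the linear schedule are all hypothetical.

```python
import math
from typing import Dict, List


def pacing_fraction(epoch: int, total_epochs: int, start: float = 0.25) -> float:
    """Assumed linear pacing: fraction of training nodes available at `epoch`."""
    if total_epochs <= 1:
        return 1.0
    return min(1.0, start + (1.0 - start) * epoch / (total_epochs - 1))


def curriculum_subset(node_losses: Dict[int, float], fraction: float) -> List[int]:
    """Return the easiest nodes (lowest loss first), keeping `fraction` of them."""
    ranked = sorted(node_losses, key=node_losses.get)
    k = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical per-node losses from a warm-up pass of the model.
    losses = {0: 0.9, 1: 0.1, 2: 0.5, 3: 0.3}
    for epoch in range(4):
        frac = pacing_fraction(epoch, total_epochs=4)
        print(epoch, curriculum_subset(losses, frac))
```

In a real training loop the losses would be recomputed periodically, so a node that starts out hard can move earlier in the ordering as the model improves.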
Abstract: Graph Neural Networks (GNNs) have shown success in various fields for learning from graph-structured data. This paper investigates the application of ensemble learning techniques to improve the performance and robustness of GNNs. By training multiple GNN models with diverse initializations or architectures, we create an ensemble model named ELGNN that captures various aspects of the data and uses the Tree-Structured Parzen Estimator algorithm to determine the ensemble weights. Combining the predictions of these models enhances overall accuracy, reduces bias and variance, and mitigates the impact of noisy data. Our findings demonstrate the efficacy of ensemble learning in enhancing GNN capabilities for analyzing complex graph-structured data. The code is public at https://github.com/wongzhenhao/ELGNN.
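The weighted combination of model predictions can be illustrated with a small sketch. It assumes each model outputs per-node class probabilities and that the ensemble takes a weighted average before the argmax. The abstract states that the weights are found with the Tree-Structured Parzen Estimator; the sketch below substitutes a plain grid search on validation accuracy as a simpler stand-in, and all function names are hypothetical.

```python
import itertools
from typing import List, Sequence, Tuple

Probs = List[List[float]]  # per-node class probabilities from one model


def ensemble_predict(model_probs: Sequence[Probs], weights: Sequence[float]) -> List[int]:
    """Weighted average of class probabilities, then argmax per node."""
    total = sum(weights)
    n_nodes, n_classes = len(model_probs[0]), len(model_probs[0][0])
    preds = []
    for i in range(n_nodes):
        row = [sum(w * p[i][c] for w, p in zip(weights, model_probs)) / total
               for c in range(n_classes)]
        preds.append(max(range(n_classes), key=row.__getitem__))
    return preds


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def grid_search_weights(model_probs: Sequence[Probs], labels: Sequence[int],
                        steps: int = 4) -> Tuple[List[float], float]:
    """Stand-in for TPE: try every weight combination on a coarse grid."""
    grid = [i / steps for i in range(steps + 1)]
    best_w, best_acc = None, -1.0
    for w in itertools.product(grid, repeat=len(model_probs)):
        if sum(w) == 0:
            continue  # all-zero weights are not a valid ensemble
        acc = accuracy(ensemble_predict(model_probs, w), labels)
        if acc > best_acc:
            best_w, best_acc = list(w), acc
    return best_w, best_acc


if __name__ == "__main__":
    # Hypothetical validation outputs of two GNNs on three nodes, two classes.
    model_a = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
    model_b = [[0.4, 0.6], [0.1, 0.9], [0.3, 0.7]]
    labels = [0, 1, 1]
    print(grid_search_weights([model_a, model_b], labels))
```

A grid search scales exponentially with the number of models, which is one reason a sequential model-based optimizer such as TPE is a natural choice when the ensemble is larger.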