Abstract:Despite significant advancements in causal research on graphs and its application to cracking label imbalance, the role of edge features in detecting the causal effects within graphs has been largely overlooked, leaving existing methods with untapped potential for further performance gains. In this paper, we enhance the causal attention mechanism through effectively leveraging edge information to disentangle the causal subgraph from the original graph, as well as further utilizing edge features to reshape graph representations. Capturing more comprehensive causal signals, our design leads to improved performance on graph classification tasks with label imbalance issues. We evaluate our approach on real-word datasets PTC, Tox21, and ogbg-molhiv, observing improvements over baselines. Overall, we highlight the importance of edge features in graph causal detection and provide a promising direction for addressing label imbalance challenges in graph-level tasks. The model implementation details and the codes are available on https://github.com/fengrui-z/ECAL
Abstract:Federated Learning (FL) enables collaborative model training while preserving the privacy of raw data. A challenge in this framework is the fair and efficient valuation of data, which is crucial for incentivizing clients to contribute high-quality data in the FL task. In scenarios involving numerous data clients within FL, it is often the case that only a subset of clients and datasets are pertinent to a specific learning task, while others might have either a negative or negligible impact on the model training process. This paper introduces a novel privacy-preserving method for evaluating client contributions and selecting relevant datasets without a pre-specified training algorithm in an FL task. Our proposed approach FedBary, utilizes Wasserstein distance within the federated context, offering a new solution for data valuation in the FL framework. This method ensures transparent data valuation and efficient computation of the Wasserstein barycenter and reduces the dependence on validation datasets. Through extensive empirical experiments and theoretical analyses, we demonstrate the potential of this data valuation method as a promising avenue for FL research.