Abstract: Causal structure learning has been extensively studied and widely used in machine learning and various applications. To achieve ideal performance, existing causal structure learning algorithms often need to centralize a large amount of data from multiple data sources. However, in the privacy-preserving setting, it is impossible to centralize data from all sources and combine them into a single dataset. To preserve data privacy, federated learning, a new learning paradigm, has attracted much attention in machine learning in recent years. In this paper, we study a privacy-aware causal structure learning problem in the federated setting and propose a novel Federated PC (FedPC) algorithm with two new strategies for preserving data privacy without centralizing data. Specifically, we first propose a novel layer-wise aggregation strategy that seamlessly adapts the PC algorithm to the federated learning paradigm for federated skeleton learning; we then design an effective strategy for learning consistent separation sets for federated edge orientation. Extensive experiments validate that FedPC is effective for causal structure learning in the federated learning setting.
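As a rough illustration of the layer-wise aggregation idea mentioned in the abstract above, the sketch below shows one way a server might combine the skeletons that clients report at a single layer (a fixed conditioning-set size) without seeing any raw data. The majority-vote rule, the `min_fraction` parameter, and the names `edge` and `aggregate_layer` are assumptions made for illustration; the abstract does not specify how FedPC aggregates.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set

Edge = FrozenSet[str]


def edge(a: str, b: str) -> Edge:
    """Undirected edge between two variables (hypothetical helper)."""
    return frozenset((a, b))


def aggregate_layer(client_skeletons: Iterable[Set[Edge]],
                    min_fraction: float = 0.5) -> Set[Edge]:
    """Keep an edge if more than `min_fraction` of clients kept it.

    Assumption: a simple majority vote. Each client only shares the
    set of edges that survived its local independence tests at the
    current layer, never its data.
    """
    clients = list(client_skeletons)
    # Each client skeleton is a set, so an edge gets one vote per client.
    votes = Counter(e for skeleton in clients for e in skeleton)
    needed = min_fraction * len(clients)
    return {e for e, n in votes.items() if n > needed}


# Three hypothetical clients reporting their skeletons for one layer.
client_a = {edge("A", "B"), edge("B", "C"), edge("A", "C")}
client_b = {edge("A", "B"), edge("B", "C")}
client_c = {edge("A", "B")}
shared_skeleton = aggregate_layer([client_a, client_b, client_c])
```

In a full layer-wise scheme, the aggregated skeleton would presumably be sent back to the clients as the starting point for the next layer, so that all clients test the same candidate edges.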
Abstract: Local causal structure learning aims to discover and distinguish direct causes (parents) and direct effects (children) of a variable of interest from data. Although progress has been made, existing methods need to search a large space to distinguish direct causes from direct effects of a target variable T. To tackle this issue, we propose a novel Efficient Local Causal Structure learning algorithm, named ELCS. Specifically, we first propose the concept of N-structures, then design an efficient Markov Blanket (MB) discovery subroutine that integrates MB learning with N-structures to learn the MB of T and simultaneously distinguish direct causes from direct effects of T. With the proposed MB subroutine, ELCS starts from the target variable, sequentially finds the MBs of variables connected to the target variable, and simultaneously constructs local causal structures over these MBs until the direct causes and direct effects of the target variable have been distinguished. Extensive experiments on eight Bayesian networks validate that ELCS achieves better accuracy and efficiency than state-of-the-art algorithms.
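For readers unfamiliar with the Markov blanket referred to in the abstract above: in a DAG, the MB of a target T consists of its parents, its children, and the other parents of its children (spouses). The sketch below computes this set from a known graph, which is not what ELCS does (ELCS learns the MB from data); the example graph and the function name `markov_blanket` are hypothetical and serve only to make the definition concrete.

```python
from typing import Dict, Set


def markov_blanket(parents: Dict[str, Set[str]], target: str) -> Set[str]:
    """Markov blanket of `target` in a DAG given as a node -> parents map."""
    direct_causes = set(parents.get(target, set()))
    direct_effects = {v for v, ps in parents.items() if target in ps}
    # Spouses: other parents of the target's children.
    spouses = {p for c in direct_effects for p in parents[c]} - {target}
    return direct_causes | direct_effects | spouses


# Hypothetical DAG: A -> T, T -> C, S -> C, C -> D
dag_parents = {
    "A": set(),
    "S": set(),
    "T": {"A"},
    "C": {"T", "S"},
    "D": {"C"},
}
mb_of_t = markov_blanket(dag_parents, "T")
```

Here the MB of T is made up of A (direct cause), C (direct effect), and S (spouse via C), while D lies outside it. The MB alone does not say which members are causes and which are effects, which is the distinction the abstract says ELCS aims to make.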
Abstract: Domain adaptation solves the learning problem in a target domain by leveraging the knowledge in a relevant source domain. While remarkable advances have been made, almost all existing domain adaptation methods rely heavily on large amounts of unlabeled target domain data to learn domain-invariant representations that generalize well to the target domain. In many real-world applications, however, target domain data may not always be available. In this paper, we study the case where, at the training phase, the target domain data is unavailable and only well-labeled source domain data is available, which we call robust domain adaptation. To tackle this problem, under the assumption that causal relationships between features and the class variable are robust across domains, we propose a novel Causal AutoEncoder (CAE), which integrates a deep autoencoder and causal structure learning into a unified model to learn causal representations using only data from a single source domain. Specifically, a deep autoencoder model is adopted to learn low-dimensional representations, and a causal structure learning model is designed to separate the low-dimensional representations into two groups: causal representations and task-irrelevant representations. Extensive experiments on three real-world datasets validate the effectiveness of CAE compared with eleven state-of-the-art methods.