Abstract:Federated Learning has emerged as a promising approach to train machine learning models on decentralized data sources while preserving data privacy. This paper proposes a new federated approach for Naive Bayes (NB) classification, assuming discrete variables. Our approach federates a discriminative variant of NB, sharing meaningless parameters instead of conditional probability tables. Therefore, this process is more reliable against possible attacks. We conduct extensive experiments on 12 datasets to validate the efficacy of our approach, comparing federated and non-federated settings. Additionally, we benchmark our method against the generative variant of NB, which serves as a baseline for comparison. Our experimental results demonstrate the effectiveness of our method in achieving accurate classification.
Abstract:This article presents MCTS-BN, an adaptation of the Monte Carlo Tree Search (MCTS) algorithm for the structural learning of Bayesian Networks (BNs). Initially designed for game tree exploration, MCTS has been repurposed to address the challenge of learning BN structures by exploring the search space of potential ancestral orders in Bayesian Networks. Then, it employs Hill Climbing (HC) to derive a Bayesian Network structure from each order. In large BNs, where the search space for variable orders becomes vast, using completely random orders during the rollout phase is often unreliable and impractical. We adopt a semi-randomized approach to address this challenge by incorporating variable orders obtained from other heuristic search algorithms such as Greedy Equivalent Search (GES), PC, or HC itself. This hybrid strategy mitigates the computational burden and enhances the reliability of the rollout process. Experimental evaluations demonstrate the effectiveness of MCTS-BN in improving BNs generated by traditional structural learning algorithms, exhibiting robust performance even when base algorithm orders are suboptimal and surpassing the gold standard when provided with favorable orders.
Abstract:Bayesian Network (BN) structure learning traditionally centralizes data, raising privacy concerns when data is distributed across multiple entities. This research introduces Federated GES (FedGES), a novel Federated Learning approach tailored for BN structure learning in decentralized settings using the Greedy Equivalence Search (GES) algorithm. FedGES uniquely addresses privacy and security challenges by exchanging only evolving network structures, not parameters or data. It realizes collaborative model development, using structural fusion to combine the limited models generated by each client in successive iterations. A controlled structural fusion is also proposed to enhance client consensus when adding any edge. Experimental results on various BNs from {\sf bnlearn}'s BN Repository validate the effectiveness of FedGES, particularly in high-dimensional (a large number of variables) and sparse data scenarios, offering a practical and privacy-preserving solution for real-world BN structure learning.
Abstract:This article focuses on the supervised classification of datasets with a large number of variables and a small number of instances. This is the case, for example, for microarray data sets commonly used in bioinformatics. Complex classifiers that require estimating statistics over many variables are not suitable for this type of data. Probabilistic classifiers with low-order probability tables, e.g. NB and AODE, are good alternatives for dealing with this type of data. AODE usually improves NB in accuracy, but suffers from high spatial complexity since $k$ models, each with $n+1$ variables, are included in the AODE ensemble. In this paper, we propose MiniAnDE, an algorithm that includes only a small number of heterogeneous base classifiers in the ensemble, i.e., each model only includes a different subset of the $k$ predictive variables. Experimental evaluation shows that using MiniAnDE classifiers on microarray data is feasible and outperforms NB and other ensembles such as bagging and random forest.