Abstract:Federated Semi-supervised Learning (FSSL) combines techniques from both fields of federated and semi-supervised learning to improve the accuracy and performance of models in a distributed environment by using a small fraction of labeled data and a large amount of unlabeled data. Without the need to centralize all data in one place for training, it collect updates of model training after devices train models at local, and thus can protect the privacy of user data. However, during the federal training process, some of the devices fail to collect enough data for local training, while new devices will be included to the group training. This leads to an unbalanced global data distribution and thus affect the performance of the global model training. Most of the current research is focusing on class imbalance with a fixed number of classes, while little attention is paid to data imbalance with a variable number of classes. Therefore, in this paper, we propose Federated Semi-supervised Learning for Class Variable Imbalance (FCVI) to solve class variable imbalance. The class-variable learning algorithm is used to mitigate the data imbalance due to changes of the number of classes. Our scheme is proved to be significantly better than baseline methods, while maintaining client privacy.
Abstract:Advanced researches on connected vehicles have recently targeted to the integration of vehicle-to-everything (V2X) networks with Machine Learning (ML) tools and distributed decision making. Federated learning (FL) is emerging as a new paradigm to train machine learning (ML) models in distributed systems, including vehicles in V2X networks. Rather than sharing and uploading the training data to the server, the updating of model parameters (e.g., neural networks' weights and biases) is applied by large populations of interconnected vehicles, acting as local learners. Despite these benefits, the limitation of existing approaches is the centralized optimization which relies on a server for aggregation and fusion of local parameters, leading to the drawback of a single point of failure and scaling issues for increasing V2X network size. Meanwhile, in intelligent transport scenarios, data collected from onboard sensors are redundant, which degrades the performance of aggregation. To tackle these problems, we explore a novel idea of decentralized data processing and introduce a federated learning framework for in-network vehicles, C-DFL(Consensus based Decentralized Federated Learning), to tackle federated learning on connected vehicles and improve learning quality. Extensive simulations have been implemented to evaluate the performance of C-DFL, that demonstrates C-DFL outperforms the performance of conventional methods in all cases.