Abstract:Federated learning (FL) is a new distributed learning paradigm, with privacy, utility, and efficiency as its primary pillars. Existing research indicates that it is unlikely to simultaneously attain infinitesimal privacy leakage, utility loss, and efficiency. Therefore, how to find an optimal trade-off solution is the key consideration when designing the FL algorithm. One common way is to cast the trade-off problem as a multi-objective optimization problem, i.e., the goal is to minimize the utility loss and efficiency reduction while constraining the privacy leakage not exceeding a predefined value. However, existing multi-objective optimization frameworks are very time-consuming, and do not guarantee the existence of the Pareto frontier, this motivates us to seek a solution to transform the multi-objective problem into a single-objective problem because it is more efficient and easier to be solved. To this end, we propose FedPAC, a unified framework that leverages PAC learning to quantify multiple objectives in terms of sample complexity, such quantification allows us to constrain the solution space of multiple objectives to a shared dimension, so that it can be solved with the help of a single-objective optimization algorithm. Specifically, we provide the results and detailed analyses of how to quantify the utility loss, privacy leakage, privacy-utility-efficiency trade-off, as well as the cost of the attacker from the PAC learning perspective.
Abstract:Federated Learning (FL) is a new machine learning framework, which enables millions of participants to collaboratively train machine learning model without compromising data privacy and security. Due to the independence and confidentiality of each client, FL does not guarantee that all clients are honest by design, which makes it vulnerable to adversarial attack naturally. In this paper, we focus on dynamic backdoor attacks under FL setting, where the goal of the adversary is to reduce the performance of the model on targeted tasks while maintaining a good performance on the main task, current existing studies are mainly focused on static backdoor attacks, that is the poison pattern injected is unchanged, however, FL is an online learning framework, and adversarial targets can be changed dynamically by attacker, traditional algorithms require learning a new targeted task from scratch, which could be computationally expensive and require a large number of adversarial training examples, to avoid this, we bridge meta-learning and backdoor attacks under FL setting, in which case we can learn a versatile model from previous experiences, and fast adapting to new adversarial tasks with a few of examples. We evaluate our algorithm on different datasets, and demonstrate that our algorithm can achieve good results with respect to dynamic backdoor attacks. To the best of our knowledge, this is the first paper that focus on dynamic backdoor attacks research under FL setting.
Abstract:Federated learning is a new machine learning framework which enables different parties to collaboratively train a model while protecting data privacy and security. Due to model complexity, network unreliability and connection in-stability, communication cost has became a major bottleneck for applying federated learning to real-world applications. Current existing strategies are either need to manual setting for hyper-parameters, or break up the original process into multiple steps, which make it hard to realize end-to-end implementation. In this paper, we propose a novel compression strategy called Residual Pooling Network (RPN). Our experiments show that RPN not only reduce data transmission effectively, but also achieve almost the same performance as compared to standard federated learning. Our new approach performs as an end-to-end procedure, which should be readily applied to all CNN-based model training scenarios for improvement of communication efficiency, and hence make it easy to deploy in real-world application without human intervention.
Abstract:Visual object detection is a computer vision-based artificial intelligence (AI) technique which has many practical applications (e.g., fire hazard monitoring). However, due to privacy concerns and the high cost of transmitting video data, it is highly challenging to build object detection models on centrally stored large training datasets following the current approach. Federated learning (FL) is a promising approach to resolve this challenge. Nevertheless, there currently lacks an easy to use tool to enable computer vision application developers who are not experts in federated learning to conveniently leverage this technology and apply it in their systems. In this paper, we report FedVision - a machine learning engineering platform to support the development of federated learning powered computer vision applications. The platform has been deployed through a collaboration between WeBank and Extreme Vision to help customers develop computer vision-based safety monitoring solutions in smart city applications. Over four months of usage, it has achieved significant efficiency improvement and cost reduction while removing the need to transmit sensitive data for three major corporate customers. To the best of our knowledge, this is the first real application of FL in computer vision-based tasks.
Abstract:Federated learning is a new machine learning paradigm which allows data parties to build machine learning models collaboratively while keeping their data secure and private. While research efforts on federated learning have been growing tremendously in the past two years, most existing works still depend on pre-existing public datasets and artificial partitions to simulate data federations due to the lack of high-quality labeled data generated from real-world edge applications. Consequently, advances on benchmark and model evaluations for federated learning have been lagging behind. In this paper, we introduce a real-world image dataset. The dataset contains more than 900 images generated from 26 street cameras and 7 object categories annotated with detailed bounding box. The data distribution is non-IID and unbalanced, reflecting the characteristic real-world federated learning scenarios. Based on this dataset, we implemented two mainstream object detection algorithms (YOLO and Faster R-CNN) and provided an extensive benchmark on model performance, efficiency, and communication in a federated learning setting. Both the dataset and algorithms are made publicly available.