Abstract:We present Project Florida, a system architecture and software development kit (SDK) enabling deployment of large-scale Federated Learning (FL) solutions across a heterogeneous device ecosystem. Federated learning is an approach to machine learning based on a strong data sovereignty principle, i.e., that privacy and security of data is best enabled by storing it at its origin, whether on end-user devices or in segregated cloud storage silos. Federated learning enables model training across devices and silos while the training data remains within its security boundary, by distributing a model snapshot to a client running inside the boundary, running client code to update the model, and then aggregating updated snapshots across many clients in a central orchestrator. Deploying a FL solution requires implementation of complex privacy and security mechanisms as well as scalable orchestration infrastructure. Scale and performance is a paramount concern, as the model training process benefits from full participation of many client devices, which may have a wide variety of performance characteristics. Project Florida aims to simplify the task of deploying cross-device FL solutions by providing cloud-hosted infrastructure and accompanying task management interfaces, as well as a multi-platform SDK supporting most major programming languages including C++, Java, and Python, enabling FL training across a wide range of operating system (OS) and hardware specifications. The architecture decouples service management from the FL workflow, enabling a cloud service provider to deliver FL-as-a-service (FLaaS) to ML engineers and application developers. We present an overview of Florida, including a description of the architecture, sample code, and illustrative experiments demonstrating system capabilities.
Abstract:In this paper we introduce "Federated Learning Utilities and Tools for Experimentation" (FLUTE), a high-performance open source platform for federated learning research and offline simulations. The goal of FLUTE is to enable rapid prototyping and simulation of new federated learning algorithms at scale, including novel optimization, privacy, and communications strategies. We describe the architecture of FLUTE, enabling arbitrary federated modeling schemes to be realized, we compare the platform with other state-of-the-art platforms, and we describe available features of FLUTE for experimentation in core areas of active research, such as optimization, privacy and scalability. We demonstrate the effectiveness of the platform with a series of experiments for text prediction and speech recognition, including the addition of differential privacy, quantization, scaling and a variety of optimization and federation approaches.