Abstract:Cross-device federated learning (FL) has been well-studied from algorithmic, system scalability, and training speed perspectives. Nonetheless, moving from centralized training to cross-device FL for millions or billions of devices presents many risks, including performance loss, developer inertia, poor user experience, and unexpected application failures. In addition, the corresponding infrastructure, development costs, and return on investment are difficult to estimate. In this paper, we present a device-cloud collaborative FL platform that integrates with an existing machine learning platform, providing tools to measure real-world constraints, assess infrastructure capabilities, evaluate model training performance, and estimate system resource requirements to responsibly bring FL into production. We also present a decision workflow that leverages the FL-integrated platform to comprehensively evaluate the trade-offs of cross-device FL and share our empirical evaluations of business-critical machine learning applications that impact hundreds of millions of users.
Abstract:Extreme multi-label classification aims to learn a classifier that annotates an instance with a relevant subset of labels from an extremely large label set. Many existing solutions embed the label matrix to a low-dimensional linear subspace, or examine the relevance of a test instance to every label via a linear scan. In practice, however, those approaches can be computationally exorbitant. To alleviate this drawback, we propose a Block-wise Partitioning (BP) pretreatment that divides all instances into disjoint clusters, to each of which the most frequently tagged label subset is attached. One multi-label classifier is trained on one pair of instance and label clusters, and the label set of a test instance is predicted by first delivering it to the most appropriate instance cluster. Experiments on benchmark multi-label data sets reveal that BP pretreatment significantly reduces prediction time, and retains almost the same level of prediction accuracy.