Abstract:Federated learning (FL) aims to collaboratively learn deep learning model parameters from decentralized data archives (i.e., clients) without accessing training data on clients. However, the training data across clients might be not independent and identically distributed (non-IID), which may result in difficulty in achieving optimal model convergence. In this work, we investigate the capability of state-of-the-art transformer architectures (which are MLP-Mixer, ConvMixer, PoolFormer) to address the challenges related to non-IID training data across various clients in the context of FL for multi-label classification (MLC) problems in remote sensing (RS). The considered transformer architectures are compared among themselves and with the ResNet-50 architecture in terms of their: 1) robustness to training data heterogeneity; 2) local training complexity; and 3) aggregation complexity under different non-IID levels. The experimental results obtained on the BigEarthNet-S2 benchmark archive demonstrate that the considered architectures increase the generalization ability with the cost of higher local training and aggregation complexities. On the basis of our analysis, some guidelines are derived for a proper selection of transformer architecture in the context of FL for RS MLC. The code of this work is publicly available at https://git.tu-berlin.de/rsim/FL-Transformer.
Abstract:Federated learning (FL) enables the collaboration of multiple deep learning models to learn from decentralized data archives (i.e., clients) without accessing data on clients. Although FL offers ample opportunities in knowledge discovery from distributed image archives, it is seldom considered in remote sensing (RS). In this paper, as a first time in RS, we present a comparative study of state-of-the-art FL algorithms. To this end, we initially provide a systematic review of the FL algorithms presented in the computer vision community for image classification problems, and select several state-of-the-art FL algorithms based on their effectiveness with respect to training data heterogeneity across clients (known as non-IID data). After presenting an extensive overview of the selected algorithms, a theoretical comparison of the algorithms is conducted based on their: 1) local training complexity; 2) aggregation complexity; 3) learning efficiency; 4) communication cost; and 5) scalability in terms of number of clients. As the classification task, we consider multi-label classification (MLC) problem since RS images typically consist of multiple classes, and thus can simultaneously be associated with multi-labels. After the theoretical comparison, experimental analyses are presented to compare them under different decentralization scenarios in terms of MLC performance. Based on our comprehensive analyses, we finally derive a guideline for selecting suitable FL algorithms in RS. The code of this work will be publicly available at https://git.tu-berlin.de/rsim/FL-RS.
Abstract:The development of federated learning (FL) methods, which aim to learn from distributed databases (i.e., clients) without accessing data on clients, has recently attracted great attention. Most of these methods assume that the clients are associated with the same data modality. However, remote sensing (RS) images in different clients can be associated with different data modalities that can improve the classification performance when jointly used. To address this problem, in this paper we introduce a novel multi-modal FL framework that aims to learn from decentralized multi-modal RS image archives for RS image classification problems. The proposed framework is made up of three modules: 1) multi-modal fusion (MF); 2) feature whitening (FW); and 3) mutual information maximization (MIM). The MF module performs iterative model averaging to learn without accessing data on clients in the case that clients are associated with different data modalities. The FW module aligns the representations learned among the different clients. The MIM module maximizes the similarity of images from different modalities. Experimental results show the effectiveness of the proposed framework compared to iterative model averaging, which is a widely used algorithm in FL. The code of the proposed framework is publicly available at https://git.tu-berlin.de/rsim/MM-FL.