Abstract:Federated Learning (FL) has emerged as a pivotal framework for the development of effective global models (global FL) or personalized models (personalized FL) across clients with heterogeneous, non-iid data distribution. A key challenge in FL is client drift, where data heterogeneity impedes the aggregation of scattered knowledge. Recent studies have tackled the client drift issue by identifying significant divergence in the last classifier layer. To mitigate this divergence, strategies such as freezing the classifier weights and aligning the feature extractor accordingly have proven effective. Although the local alignment between classifier and feature extractor has been studied as a crucial factor in FL, we observe that it may lead the model to overemphasize the observed classes within each client. Thus, our objectives are twofold: (1) enhancing local alignment while (2) preserving the representation of unseen class samples. This approach aims to effectively integrate knowledge from individual clients, thereby improving performance for both global and personalized FL. To achieve this, we introduce a novel algorithm named FedDr+, which empowers local model alignment using dot-regression loss. FedDr+ freezes the classifier as a simplex ETF to align the features and improves aggregated global models by employing a feature distillation mechanism to retain information about unseen/missing classes. Consequently, we provide empirical evidence demonstrating that our algorithm surpasses existing methods that use a frozen classifier to boost alignment across the diverse distribution.
Abstract:Federated Learning (FL) is a collaborative method for training models while preserving data privacy in decentralized settings. However, FL encounters challenges related to data heterogeneity, which can result in performance degradation. In our study, we observe that as data heterogeneity increases, feature representation in the FedAVG model deteriorates more significantly compared to classifier weight. Additionally, we observe that as data heterogeneity increases, the gap between higher feature norms for observed classes, obtained from local models, and feature norms of unobserved classes widens, in contrast to the behavior of classifier weight norms. This widening gap extends to encompass the feature norm disparities between local and the global models. To address these issues, we introduce Federated Averaging with Feature Normalization Update (FedFN), a straightforward learning method. We demonstrate the superior performance of FedFN through extensive experiments, even when applied to pretrained ResNet18. Subsequently, we confirm the applicability of FedFN to foundation models.
Abstract:Early degradation prediction of lithium-ion batteries is crucial for ensuring safety and preventing unexpected failure in manufacturing and diagnostic processes. Long-term capacity trajectory predictions can fail due to cumulative errors and noise. To address this issue, this study proposes a data-centric method that uses early single-cycle data to predict the capacity degradation trajectory of lithium-ion cells. The method involves predicting a few knots at specific retention levels using a deep learning-based model and interpolating them to reconstruct the trajectory. Two approaches are used to identify the retention levels of two to four knots: uniformly dividing the retention up to the end of life and finding optimal locations using Bayesian optimization. The proposed model is validated with experimental data from 169 cells using five-fold cross-validation. The results show that mean absolute percentage errors in trajectory prediction are less than 1.60% for all cases of knots. By predicting only the cycle numbers of at least two knots based on early single-cycle charge and discharge data, the model can directly estimate the overall capacity degradation trajectory. Further experiments suggest using three-cycle input data to achieve robust and efficient predictions, even in the presence of noise. The method is then applied to predict various shapes of capacity degradation patterns using additional experimental data from 82 cells. The study demonstrates that collecting only the cycle information of a few knots during model training and a few early cycle data points for predictions is sufficient for predicting capacity degradation. This can help establish appropriate warranties or replacement cycles in battery manufacturing and diagnosis processes.
Abstract:The performance of deep neural networks is strongly influenced by the training dataset setup. In particular, when attributes having a strong correlation with the target attribute are present, the trained model can provide unintended prejudgments and show significant inference errors (i.e., the dataset bias problem). Various methods have been proposed to mitigate dataset bias, and their emphasis is on weakly correlated samples, called bias-conflicting samples. These methods are based on explicit bias labels involving human or empirical correlation metrics (e.g., training loss). However, such metrics require human costs or have insufficient theoretical explanation. In this study, we propose a debiasing algorithm, called PGD (Per-sample Gradient-based Debiasing), that comprises three steps: (1) training a model on uniform batch sampling, (2) setting the importance of each sample in proportion to the norm of the sample gradient, and (3) training the model using importance-batch sampling, whose probability is obtained in step (2). Compared with existing baselines for various synthetic and real-world datasets, the proposed method showed state-of-the-art accuracy for a the classification task. Furthermore, we describe theoretical understandings about how PGD can mitigate dataset bias.
Abstract:This paper proposes a fully unsupervised methodology for the reliable extraction of latent variables representing the characteristics of lithium-ion batteries (LIBs) from electrochemical impedance spectroscopy (EIS) data using information maximizing generative adversarial networks. Meaningful representations can be obtained from EIS data even when measured with direct current and without relaxation, which are difficult to express when using circuit models. The extracted latent variables were investigated as capacity degradation progressed and were used to estimate the discharge capacity of the batteries by employing Gaussian process regression. The proposed method was validated under various conditions of EIS data during charging and discharging. The results indicate that the proposed model provides more robust capacity estimations than the direct capacity estimations obtained from EIS. We demonstrate that the latent variables extracted from the EIS data measured with direct current and without relaxation reliably represent the degradation characteristics of LIBs.