Fred Ngolè Mboula

Online Multi-Source Domain Adaptation through Gaussian Mixtures and Dataset Dictionary Learning

Jul 29, 2024

Eduardo Fernandes Montesuma, Stevan Le Stanc, Fred Ngolè Mboula

Abstract: This paper addresses the challenge of online multi-source domain adaptation (MSDA) in transfer learning, a scenario where one needs to adapt multiple, heterogeneous source domains towards a target domain that comes in a stream. We introduce a novel approach for the online fit of a Gaussian Mixture Model (GMM), based on the Wasserstein geometry of Gaussian measures. We build upon this method and recent developments in dataset dictionary learning to propose a novel strategy for online MSDA. Experiments on the challenging Tennessee Eastman Process benchmark demonstrate that our approach is able to adapt on the fly to the stream of target domain data. Furthermore, our online GMM serves as a memory, representing the whole stream of data.

* 6 pages, 3 figures, accepted at the IEEE International Workshop on Machine Learning for Signal Processing 2024
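
As an illustration of the "Wasserstein geometry of Gaussian measures" mentioned in the abstract above, the following minimal sketch computes the standard closed-form squared 2-Wasserstein distance between two Gaussians. It is not the authors' online fitting procedure, which the abstract does not detail; the function names `psd_sqrt` and `gaussian_w2_squared` are hypothetical and chosen only for this example.

```python
import numpy as np


def psd_sqrt(mat):
    """Square root of a symmetric positive semi-definite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_squared(mean_a, cov_a, mean_b, cov_b):
    """Squared 2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b).

    Closed form: ||mean_a - mean_b||^2
                 + Tr(cov_a + cov_b - 2 (cov_a^{1/2} cov_b cov_a^{1/2})^{1/2}).
    """
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    bures = np.trace(cov_a + cov_b - 2.0 * cross)
    # The covariance term is non-negative in exact arithmetic; clip rounding noise.
    return float(np.sum((mean_a - mean_b) ** 2) + max(bures, 0.0))
```

For two one-dimensional Gaussians with equal means and standard deviations 1 and 2, the distance reduces to the squared difference of standard deviations, which is 1.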

Dataset Dictionary Learning in a Wasserstein Space for Federated Domain Adaptation

Jul 16, 2024

Eduardo Fernandes Montesuma, Fabiola Espinoza Castellon, Fred Ngolè Mboula, Aurélien Mayoue, Antoine Souloumiac, Cédric Gouy-Pailler

Abstract: Multi-Source Domain Adaptation (MSDA) is a challenging scenario where multiple related and heterogeneous source datasets must be adapted to an unlabeled target dataset. Conventional MSDA methods often overlook that data holders may have privacy concerns, hindering direct data sharing. In response, decentralized MSDA has emerged as a promising strategy to achieve adaptation without centralizing clients' data. Our work proposes a novel approach, Decentralized Dataset Dictionary Learning, to address this challenge. Our method leverages Wasserstein barycenters to model the distributional shift across multiple clients, enabling effective adaptation while preserving data privacy. Specifically, our algorithm expresses each client's underlying distribution as a Wasserstein barycenter of public atoms, weighted by private barycentric coordinates. Our approach ensures that the barycentric coordinates remain undisclosed throughout the adaptation process. Extensive experimentation across five visual domain adaptation benchmarks demonstrates the superiority of our strategy over existing decentralized MSDA techniques. Moreover, our method exhibits enhanced robustness to client parallelism while maintaining relative resilience compared to conventional decentralized MSDA methodologies.

* 17 pages, 7 figures

Lighter, Better, Faster Multi-Source Domain Adaptation with Gaussian Mixture Models and Optimal Transport

Apr 16, 2024

Eduardo Fernandes Montesuma, Fred Ngolè Mboula, Antoine Souloumiac

Abstract: In this paper, we tackle Multi-Source Domain Adaptation (MSDA), a task in transfer learning where one adapts multiple heterogeneous, labeled source probability measures towards a different, unlabeled target measure. We propose a novel framework for MSDA, based on Optimal Transport (OT) and Gaussian Mixture Models (GMMs). Our framework has two key advantages. First, OT between GMMs can be solved efficiently via linear programming. Second, it provides a convenient model for supervised learning, especially classification, as components in the GMM can be associated with existing classes. Based on the GMM-OT problem, we propose a novel technique for calculating barycenters of GMMs. Based on this novel algorithm, we propose two new strategies for MSDA: GMM-WBT and GMM-DaDiL. We empirically evaluate our proposed methods on four benchmarks in image classification and fault diagnosis, showing that we improve over the prior art while being faster and involving fewer parameters.

* Under review
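
The abstract above notes that OT between GMMs can be solved via linear programming. The sketch below shows one common way to set this up, under stated assumptions: components have diagonal covariances (so the Gaussian 2-Wasserstein cost has a simple form), and the plan couples component weights. The names `diag_gaussian_cost` and `gmm_ot` are hypothetical, and this is an illustration rather than the paper's implementation.

```python
import numpy as np
from scipy.optimize import linprog


def diag_gaussian_cost(means_a, stds_a, means_b, stds_b):
    """Pairwise squared W2 between diagonal Gaussians.

    Inputs have shapes (K, d) and (L, d); the result has shape (K, L).
    For diagonal covariances, W2^2 = ||m_a - m_b||^2 + ||std_a - std_b||^2.
    """
    mean_term = ((means_a[:, None, :] - means_b[None, :, :]) ** 2).sum(axis=-1)
    std_term = ((stds_a[:, None, :] - stds_b[None, :, :]) ** 2).sum(axis=-1)
    return mean_term + std_term


def gmm_ot(weights_a, weights_b, cost):
    """Solve min_plan <plan, cost> s.t. plan >= 0, rows sum to weights_a, columns to weights_b."""
    k, l = cost.shape
    # The plan is flattened row-major, so entry (i, j) sits at index i * l + j.
    row_constraints = np.kron(np.eye(k), np.ones((1, l)))
    col_constraints = np.kron(np.ones((1, k)), np.eye(l))
    result = linprog(
        cost.ravel(),
        A_eq=np.vstack([row_constraints, col_constraints]),
        b_eq=np.concatenate([weights_a, weights_b]),
        bounds=(0, None),
        method="highs",
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x.reshape(k, l), float(result.fun)


# Usage: a single-component GMM transported to a two-component GMM.
cost = diag_gaussian_cost(
    np.array([[0.0]]), np.array([[1.0]]),
    np.array([[1.0], [-1.0]]), np.array([[1.0], [1.0]]),
)
plan, total_cost = gmm_ot(np.array([1.0]), np.array([0.5, 0.5]), cost)
```

The linear program has only K x L variables (one per pair of components), which is why this formulation is much smaller than OT between the underlying samples.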

Multi-Source Domain Adaptation meets Dataset Distillation through Dataset Dictionary Learning

Sep 14, 2023

Eduardo Fernandes Montesuma, Fred Ngolè Mboula, Antoine Souloumiac

Abstract: In this paper, we consider the intersection of two problems in machine learning: Multi-Source Domain Adaptation (MSDA) and Dataset Distillation (DD). On the one hand, the first considers adapting multiple heterogeneous labeled source domains to an unlabeled target domain. On the other hand, the second attacks the problem of synthesizing a small summary containing all the information about the datasets. We thus consider a new problem called MSDA-DD. To solve it, we adapt previous works in the MSDA literature, such as Wasserstein Barycenter Transport and Dataset Dictionary Learning, as well as the DD method Distribution Matching. We thoroughly experiment with this novel problem on four benchmarks (Caltech-Office 10, Tennessee-Eastman Process, Continuous Stirred Tank Reactor, and Case Western Reserve University), where we show that, even with as little as 1 sample per class, one achieves state-of-the-art adaptation performance.

* 7 pages, 4 figures

Federated Dataset Dictionary Learning for Multi-Source Domain Adaptation

Sep 14, 2023

Fabiola Espinoza Castellon, Eduardo Fernandes Montesuma, Fred Ngolè Mboula, Aurélien Mayoue, Antoine Souloumiac, Cédric Gouy-Pailler

Abstract: In this article, we propose an approach for federated domain adaptation, a setting where distributional shift exists among clients and some have unlabeled data. The proposed framework, FedDaDiL, tackles the resulting challenge through dictionary learning of empirical distributions. In our setting, clients' distributions represent particular domains, and FedDaDiL collectively trains a federated dictionary of empirical distributions. In particular, we build upon the Dataset Dictionary Learning framework by designing collaborative communication protocols and aggregation operations. The chosen protocols keep clients' data private, thus enhancing overall privacy compared to its centralized counterpart. We empirically demonstrate that our approach successfully generates labeled data on the target domain with extensive experiments on (i) Caltech-Office, (ii) TEP, and (iii) CWRU benchmarks. Furthermore, we compare our method to its centralized counterpart and other benchmarks in federated domain adaptation.

* 7 pages, 2 figures

Multi-Source Domain Adaptation for Cross-Domain Fault Diagnosis of Chemical Processes

Aug 22, 2023

Eduardo Fernandes Montesuma, Michela Mulas, Fred Ngolè Mboula, Francesco Corona, Antoine Souloumiac

Abstract: Fault diagnosis is an essential component in process supervision. Indeed, it determines which kind of fault has occurred, given that it has been previously detected, allowing for appropriate intervention. Automatic fault diagnosis systems use machine learning for predicting the fault type from sensor readings. Nonetheless, these models are sensitive to changes in the data distributions, which may be caused by changes in the monitored process, such as changes in the mode of operation. This scenario is known as Cross-Domain Fault Diagnosis (CDFD). We provide an extensive comparison of single and multi-source unsupervised domain adaptation (SSDA and MSDA respectively) algorithms for CDFD. We study these methods in the context of the Tennessee-Eastman Process, a widely used benchmark in the chemical industry. We show that using multiple domains during training has a positive effect, even when no adaptation is employed. As such, the MSDA baseline improves over the SSDA baseline classification accuracy by 23% on average. In addition, under the multiple-sources scenario, we improve classification accuracy of the no adaptation setting by 8.4% on average.

* 18 pages, 15 figures

Multi-Source Domain Adaptation through Dataset Dictionary Learning in Wasserstein Space

Jul 27, 2023

Eduardo Fernandes Montesuma, Fred Ngolè Mboula, Antoine Souloumiac

Abstract: This paper seeks to solve Multi-Source Domain Adaptation (MSDA), which aims to mitigate data distribution shifts when transferring knowledge from multiple labeled source domains to an unlabeled target domain. We propose a novel MSDA framework based on dictionary learning and optimal transport. We interpret each domain in MSDA as an empirical distribution. As such, we express each domain as a Wasserstein barycenter of dictionary atoms, which are empirical distributions. We propose a novel algorithm, DaDiL, for learning via mini-batches: (i) atom distributions; (ii) a matrix of barycentric coordinates. Based on our dictionary, we propose two novel methods for MSDA: DaDiL-R, based on the reconstruction of labeled samples in the target domain, and DaDiL-E, based on the ensembling of classifiers learned on atom distributions. We evaluate our methods on three benchmarks: Caltech-Office, Office 31, and CWRU, where we improved previous state-of-the-art by 3.15%, 2.29%, and 7.71% in classification performance. Finally, we show that interpolations in the Wasserstein hull of learned atoms provide data that can generalize to the target domain.

* 13 pages, 9 figures, accepted as a conference paper at the 26th European Conference on Artificial Intelligence
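
To make the idea of expressing a domain as a Wasserstein barycenter of atoms with barycentric coordinates more concrete, here is a deliberately simplified sketch. It assumes one-dimensional, unlabeled atoms with the same number of uniformly weighted samples, in which case the 2-Wasserstein barycenter is obtained by averaging sorted samples. DaDiL itself handles multi-dimensional, labeled empirical distributions and learns atoms and coordinates via mini-batches, none of which is reproduced here; the name `barycenter_1d` is hypothetical.

```python
import numpy as np


def barycenter_1d(atoms, coordinates):
    """2-Wasserstein barycenter of 1D empirical distributions with equal sample counts.

    atoms: array of shape (K, n), one row of samples per atom.
    coordinates: array of shape (K,), non-negative and summing to 1.
    Returns the n support points of the barycenter (uniform weights).
    """
    atoms = np.sort(np.asarray(atoms, dtype=float), axis=1)
    coordinates = np.asarray(coordinates, dtype=float)
    if np.any(coordinates < 0) or not np.isclose(coordinates.sum(), 1.0):
        raise ValueError("coordinates must lie on the probability simplex")
    # In 1D, sorted samples are empirical quantiles; the barycenter averages them.
    return coordinates @ atoms


# Usage: interpolate halfway between two atoms.
midpoint = barycenter_1d([[2.0, 0.0, 1.0], [10.0, 12.0, 11.0]], [0.5, 0.5])
```

Varying the coordinates over the simplex traces out the interpolations between atoms that the abstract refers to as the Wasserstein hull.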

Recent Advances in Optimal Transport for Machine Learning

Jun 28, 2023

Eduardo Fernandes Montesuma, Fred Ngolè Mboula, Antoine Souloumiac

Abstract: Recently, Optimal Transport has been proposed as a probabilistic framework in Machine Learning for comparing and manipulating probability distributions. This is rooted in its rich history and theory, and has offered new solutions to different problems in machine learning, such as generative modeling and transfer learning. In this survey we explore contributions of Optimal Transport for Machine Learning over the period 2012–2022, focusing on four sub-fields of Machine Learning: supervised, unsupervised, transfer and reinforcement learning. We further highlight the recent development in computational Optimal Transport, and its interplay with Machine Learning practice.

* 20 pages, 5 figures, under review
