Abstract: Federated Class Incremental Learning (FCIL) is a new direction in continual learning (CL) that addresses catastrophic forgetting and non-IID data distributions simultaneously. Existing FCIL methods incur high communication costs and require exemplars from previous classes. We propose a novel rehearsal-free method for FCIL named prototypes-injected prompt (PIP) that involves three main ideas: a) prototype injection on prompt learning, b) prototype augmentation, and c) weighted Gaussian aggregation on the server side. Our experimental results show that the proposed method outperforms the current state-of-the-art (SOTA) methods with a significant improvement (up to 33%) on the CIFAR100, MiniImageNet, and TinyImageNet datasets. Our extensive analysis demonstrates the robustness of PIP across different task sizes and its advantage of requiring fewer participating local clients and fewer global rounds. For further study, the source code of PIP, baselines, and experimental logs are shared publicly at \url{https://github.com/anwarmaxsum/PIP}.
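To make the three PIP ideas concrete, below is a minimal PyTorch sketch of (a) per-class prototype computation on a client, (b) Gaussian prototype augmentation, and (c) a Gaussian-weighted aggregation of client prompts on the server. Function names, shapes, and the noise scale are illustrative assumptions, not the paper's actual API; see the repository above for the real implementation.

```python
import torch

def class_prototypes(features: torch.Tensor, labels: torch.Tensor) -> dict:
    """Mean feature vector per class, computed locally on a client."""
    return {int(c): features[labels == c].mean(dim=0) for c in labels.unique()}

def augment_prototype(proto: torch.Tensor, std: float = 0.1, n: int = 5) -> torch.Tensor:
    """Prototype augmentation: jitter a prototype with Gaussian noise (assumed std)."""
    return proto + std * torch.randn(n, proto.numel())

def gaussian_weighted_aggregate(client_prompts: list) -> torch.Tensor:
    """Server side: weight each client's prompt by a Gaussian of its squared
    distance to the mean prompt, then take the weighted sum (a robust mean)."""
    stacked = torch.stack(client_prompts)            # (num_clients, ...)
    mean = stacked.mean(dim=0, keepdim=True)
    d2 = ((stacked - mean) ** 2).flatten(1).sum(1)   # squared distance per client
    w = torch.softmax(-d2, dim=0)                    # Gaussian-like weights
    return (w.view(-1, *[1] * (stacked.dim() - 1)) * stacked).sum(dim=0)
```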
Abstract: Most few-shot learning works rely on the assumption that the base and target tasks come from the same domain, hindering their practical application. This paper proposes an adaptive transformer network (ADAPTER), a simple but effective solution for cross-domain few-shot learning where large domain shifts exist between the base task and the target task. ADAPTER is built upon the idea of bidirectional cross-attention to learn transferable features between the two domains. The proposed architecture is trained with DINO to produce diverse, less biased features, avoiding the supervision collapse problem. Furthermore, a label smoothing approach is proposed to improve the consistency and reliability of the predictions by also considering the predicted labels of close samples in the embedding space. The performance of ADAPTER is rigorously evaluated on the BSCD-FSL benchmark, where it outperforms prior arts by significant margins.
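The core mechanism, bidirectional cross-attention between base-domain and target-domain features, can be sketched in a few lines of PyTorch. This is a hedged illustration built on torch.nn.MultiheadAttention; the module name, dimensions, and residual fusion are assumptions rather than ADAPTER's exact architecture.

```python
import torch
import torch.nn as nn

class BidirectionalCrossAttention(nn.Module):
    """Each domain queries the other, yielding transferable features."""
    def __init__(self, dim: int = 384, heads: int = 6):
        super().__init__()
        self.a2b = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.b2a = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        a_att, _ = self.a2b(feat_a, feat_b, feat_b)  # A attends to B
        b_att, _ = self.b2a(feat_b, feat_a, feat_a)  # B attends to A
        return feat_a + a_att, feat_b + b_att        # residual fusion (assumed)
```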
Abstract: A reliable long-term time-series forecaster is in high demand in practice but faces many challenges, such as the need for low computational and memory footprints as well as robustness against dynamic learning environments. This paper proposes Meta-Transformer Networks (MANTRA) to deal with dynamic long-term time-series forecasting tasks. MANTRA relies on the concept of fast and slow learners, where a collection of fast learners learns different aspects of the data distribution while adapting quickly to changes, and a slow learner tailors suitable representations for the fast learners. Fast adaptation to dynamic environments is achieved using universal representation transformer layers that produce task-adapted representations with a small number of parameters. Our experiments using four datasets with different prediction lengths demonstrate the advantage of our approach with improvements of at least $3\%$ over the baseline algorithms in both multivariate and univariate settings. The source code of MANTRA is publicly available at \url{https://github.com/anwarmaxsum/MANTRA}.
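As a rough illustration of the fast/slow learner concept, the sketch below pairs one slow shared encoder with several small fast heads whose predictions are ensembled. The linear layers and plain averaging are simplifying assumptions; MANTRA's actual universal representation transformer layers are more elaborate.

```python
import torch
import torch.nn as nn

class FastSlowForecaster(nn.Module):
    def __init__(self, in_len: int, out_len: int, dim: int = 64, n_fast: int = 3):
        super().__init__()
        # Slow learner: a shared encoder that tailors representations.
        self.slow = nn.Sequential(nn.Linear(in_len, dim), nn.ReLU(),
                                  nn.Linear(dim, dim))
        # Fast learners: lightweight heads that adapt quickly to drift.
        self.fast = nn.ModuleList(nn.Linear(dim, out_len) for _ in range(n_fast))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.slow(x)                                # shared representation
        preds = torch.stack([h(z) for h in self.fast])  # (n_fast, batch, out_len)
        return preds.mean(dim=0)                        # ensemble the fast learners
```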
Abstract: Existing approaches to continual learning require large numbers of samples in their training processes. Such approaches are impractical for many real-world problems with limited samples because of the overfitting problem. This paper proposes a few-shot continual learning approach, termed FLat-tO-WidE AppRoach (FLOWER), in which a flat-to-wide learning process that finds flat-wide minima is proposed to address the catastrophic forgetting problem. The issue of data scarcity is overcome with a data augmentation approach that uses a ball-generator concept to restrict the sampling space to the smallest enclosing ball. Our numerical studies demonstrate the advantage of FLOWER, which achieves significantly improved performance over prior arts, notably on small base tasks. For further study, the source code of FLOWER, competitor algorithms, and experimental logs are shared publicly at \url{https://github.com/anwarmaxsum/FLOWER}.
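The ball-generator augmentation admits a compact sketch: approximate the smallest enclosing ball of a few-shot class by its centroid and maximum radius, then sample uniformly inside it. The centroid/max-radius approximation and the function name are assumptions for illustration, not FLOWER's exact construction.

```python
import torch

def ball_generator(samples: torch.Tensor, n_new: int = 10) -> torch.Tensor:
    """Draw synthetic points uniformly inside an approximate smallest
    enclosing ball of `samples` (shape: n_samples x dim)."""
    center = samples.mean(dim=0)
    radius = (samples - center).norm(dim=1).max()
    d = samples.size(1)
    dirs = torch.randn(n_new, d)
    dirs = dirs / dirs.norm(dim=1, keepdim=True)    # unit directions
    r = radius * torch.rand(n_new, 1) ** (1.0 / d)  # u^(1/d) for uniform volume
    return center + r * dirs
```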
Abstract: This paper proposes an assessor-guided learning strategy for continual learning in which an assessor guides the learning process of a base learner by controlling its direction and pace, allowing efficient learning of new environments while protecting against the catastrophic interference problem. The assessor is trained in a meta-learning manner with a meta-objective to boost the learning process of the base learner. It applies a soft-weighting mechanism to every sample, accepting positive samples while rejecting negative ones. The training objective of the base learner is to minimize a meta-weighted combination of the cross-entropy loss, the dark experience replay (DER) loss, and the knowledge distillation loss, whose interactions are controlled so as to attain improved performance. A compensated over-sampling (COS) strategy is developed to overcome the class-imbalance problem of the episodic memory caused by limited memory budgets. Our approach, the Assessor-Guided Learning Approach (AGLA), has been evaluated on class-incremental and task-incremental learning problems. AGLA achieves improved performance compared to its competitors, and a theoretical analysis of the COS strategy is offered. The source code of AGLA, baseline algorithms, and experimental logs are shared publicly at \url{https://github.com/anwarmaxsum/AGLA} for further study.
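The meta-weighted training objective can be sketched as below, assuming the assessor emits a per-sample weight in [0, 1] and that old-model logits and replay-buffer logits are available; the DER term here is the standard logit-regression form, and the fixed term weights are placeholders for AGLA's learned meta-weights.

```python
import torch
import torch.nn.functional as F

def assessor_weighted_loss(logits, targets, old_logits, buffer_logits,
                           sample_w, term_w=(1.0, 1.0, 1.0)):
    """Meta-weighted combination of cross-entropy, a DER-style logit
    regression term, and a knowledge-distillation term, soft-weighted
    per sample by the assessor's output `sample_w`."""
    ce = F.cross_entropy(logits, targets, reduction="none")            # per sample
    der = F.mse_loss(logits, buffer_logits, reduction="none").mean(1)  # DER term
    kd = F.kl_div(F.log_softmax(logits, dim=1),
                  F.softmax(old_logits, dim=1),
                  reduction="none").sum(1)                             # KD term
    total = term_w[0] * ce + term_w[1] * der + term_w[2] * kd
    return (sample_w * total).mean()
```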