Abstract: Class-incremental learning (CIL) aims to enable models to continuously learn new classes while overcoming catastrophic forgetting. The introduction of pre-trained models has brought new tuning paradigms to CIL. In this paper, we revisit different parameter-efficient tuning (PET) methods within the context of continual learning. We observe that adapter tuning outperforms prompt-based methods, even without parameter expansion in each learning session. Motivated by this, we propose incrementally tuning a shared adapter without imposing parameter-update constraints, which enhances the learning capacity of the backbone. Additionally, we sample features from stored prototypes to retrain a unified classifier, further improving its performance. To keep old prototypes valid, we estimate their semantic shift without access to past samples and update the stored prototypes session by session. Our proposed method requires no model expansion and retains no image samples. It surpasses previous pre-trained model-based CIL methods and demonstrates strong continual learning capability. Experimental results on five CIL benchmarks validate the effectiveness of our approach, which achieves state-of-the-art (SOTA) performance.
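To make the prototype-based part of this abstract concrete, below is a minimal sketch of prototype storage, drift estimation for old prototypes, and classifier retraining via feature sampling. It is not the authors' implementation: the helper names (`class_prototypes`, `estimate_shift`, `retrain_classifier`), the Gaussian sampling around prototypes, and the kernel-weighted drift estimator are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of: storing class prototypes,
# estimating how old prototypes drift after tuning the adapter, and
# retraining a unified linear classifier on features sampled from prototypes.
import torch
import torch.nn as nn
import torch.nn.functional as F


def class_prototypes(features: torch.Tensor, labels: torch.Tensor):
    """Return {class_id: (mean, per-dim variance)} feature statistics."""
    protos = {}
    for c in labels.unique().tolist():
        feats_c = features[labels == c]
        protos[c] = (feats_c.mean(0), feats_c.var(0, unbiased=False) + 1e-6)
    return protos


def estimate_shift(old_protos, feats_before, feats_after, sigma=2.0):
    """Estimate the semantic shift of old prototypes without old samples.

    Assumption: drift is locally smooth in feature space, so each old
    prototype moves by a kernel-weighted average of the displacement of
    the current session's features before vs. after tuning."""
    delta = feats_after - feats_before                     # (N, D) displacements
    updated = {}
    for c, (mu, var) in old_protos.items():
        d2 = ((feats_before - mu) ** 2).sum(-1)            # squared distances to prototype
        w = torch.softmax(-d2 / (2 * sigma ** 2), dim=0)   # kernel weights
        updated[c] = (mu + (w.unsqueeze(-1) * delta).sum(0), var)
    return updated


def retrain_classifier(protos, feat_dim, num_classes, samples_per_class=64, epochs=5):
    """Retrain a unified linear head on features sampled around stored prototypes."""
    clf = nn.Linear(feat_dim, num_classes)
    opt = torch.optim.SGD(clf.parameters(), lr=0.01, momentum=0.9)
    for _ in range(epochs):
        xs, ys = [], []
        for c, (mu, var) in protos.items():
            eps = torch.randn(samples_per_class, feat_dim)
            xs.append(mu + eps * var.sqrt())               # Gaussian sampling, no stored images
            ys.append(torch.full((samples_per_class,), c, dtype=torch.long))
        x, y = torch.cat(xs), torch.cat(ys)
        loss = F.cross_entropy(clf(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return clf
```

In this reading, only per-class feature statistics are kept across sessions, which is consistent with the abstract's claim of avoiding both model expansion and image storage.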
Abstract: This paper formalizes the source-blind knowledge distillation problem that is essential to federated learning. A new geometric perspective is presented that views this problem as aligning the generated distributions of the teacher and the student. Guided by this perspective, a new architecture, MEKD, is proposed to emulate the inverse mapping through generative adversarial training. Unlike mimicking logits or aligning logit distributions, reconstructing the mapping from classifier logits has a geometric interpretation of decreasing empirical distances, with theoretical guarantees from universal function approximation and optimal mass transportation theories. A new algorithm is also proposed to train the student model so that it reaches the teacher's performance source-blindly. On various benchmarks, MEKD outperforms existing source-blind KD methods, and ablation studies and visualized results explain its behavior.
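The sketch below illustrates the general idea of source-blind distillation with a generator that emulates an inverse mapping from logits back to the input space. It is not the MEKD architecture or objective: the network shapes, the L1 discrepancy loss, and the alternating adversarial schedule are assumptions in the spirit of data-free adversarial distillation.

```python
# Illustrative sketch (not the MEKD implementation) of source-blind KD:
# a generator maps target logits (plus noise) to synthetic inputs, and the
# student is trained to match the teacher on those inputs.
import torch
import torch.nn as nn
import torch.nn.functional as F


class InverseMapper(nn.Module):
    """Maps a target logit vector (plus noise) to a synthetic input."""
    def __init__(self, num_classes=10, noise_dim=64, img_shape=(1, 28, 28)):
        super().__init__()
        self.img_shape = img_shape
        out_dim = int(torch.tensor(img_shape).prod())
        self.net = nn.Sequential(
            nn.Linear(num_classes + noise_dim, 256), nn.ReLU(),
            nn.Linear(256, out_dim), nn.Tanh(),
        )

    def forward(self, logits, noise):
        x = self.net(torch.cat([logits, noise], dim=1))
        return x.view(-1, *self.img_shape)


def distill_step(teacher, student, gen, opt_g, opt_s,
                 num_classes=10, noise_dim=64, batch=32):
    """One adversarial round: the generator seeks teacher-student disagreement,
    then the student matches the teacher on the generated inputs."""
    target_logits = torch.randn(batch, num_classes)        # sampled target logits
    noise = torch.randn(batch, noise_dim)

    # Generator step: maximize teacher-student discrepancy (minimize its negative).
    fake = gen(target_logits, noise)
    loss_g = -F.l1_loss(student(fake), teacher(fake).detach())
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

    # Student step: minimize discrepancy on freshly generated inputs.
    fake = gen(target_logits, noise).detach()
    loss_s = F.l1_loss(student(fake), teacher(fake).detach())
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    return loss_s.item()
```

No source data appears anywhere in the loop; the teacher is queried only on generated inputs, which is the sense in which the distillation is source-blind.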
Abstract: Different from fine-tuning models pre-trained on a large-scale dataset of preset classes, class-incremental learning (CIL) aims to recognize novel classes over time without forgetting the pre-trained classes. However, a given model will be challenged by test images of finer-grained classes; e.g., a basenji is at most recognized as a dog. Such images form a new training set (i.e., a support set), and the incremental model is expected to recognize a basenji (i.e., a query) as a basenji next time. This paper formulates this hybrid natural problem of coarse-to-fine few-shot (C2FS) recognition as a CIL problem named C2FSCIL, and proposes a simple, effective, and theoretically sound strategy, Knowe: learn, normalize, and freeze a classifier's weights from fine labels, after learning an embedding space contrastively from coarse labels. Moreover, because CIL aims at a stability-plasticity balance, new overall performance metrics are proposed. In that sense, on CIFAR-100, BREEDS, and tieredImageNet, Knowe outperforms all recent relevant CIL/FSCIL methods adapted to the new problem setting, which is addressed here for the first time.
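As a concrete reading of the "learn, normalize, and freeze" step, here is a minimal sketch of a cosine classifier whose per-class weights are computed from the fine-label support set, L2-normalized, and then never updated. The class and method names are illustrative, and the contrastive pre-training of the embedding on coarse labels is assumed to happen elsewhere; this is not the authors' code.

```python
# Minimal sketch (not the authors' code) of the learn-normalize-freeze idea:
# the embedding is trained contrastively on coarse labels beforehand; each
# fine class then gets a weight vector that is learned once, normalized,
# and frozen across later incremental sessions.
import torch
import torch.nn.functional as F


class FrozenCosineClassifier:
    """Cosine classifier whose class weights are written once and never updated."""
    def __init__(self, feat_dim: int):
        self.feat_dim = feat_dim
        self.weights = {}                                  # class_id -> frozen unit vector

    def enroll(self, embeddings: torch.Tensor, labels: torch.Tensor):
        """Learn and freeze weights for the fine classes of one incremental session."""
        for c in labels.unique().tolist():
            if c in self.weights:                          # frozen: never overwrite old classes
                continue
            w = embeddings[labels == c].mean(0)            # learn from fine-label support features
            self.weights[c] = F.normalize(w, dim=0)        # normalize, then freeze

    def predict(self, embeddings: torch.Tensor) -> torch.Tensor:
        classes = sorted(self.weights)
        W = torch.stack([self.weights[c] for c in classes])    # (C, D), unit rows
        scores = F.normalize(embeddings, dim=1) @ W.t()        # cosine similarity
        return torch.tensor(classes)[scores.argmax(1)]
```

Freezing the normalized weights preserves old fine classes (stability), while each new session only appends weights for its new classes (plasticity), which mirrors the balance the proposed metrics are meant to capture.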