Title: Towards Efficient Visual Adaption via Structural Re-parameterization

Feb 16, 2023

Gen Luo, Minglang Huang, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Zhiyu Wang, Rongrong Ji


Abstract: Parameter-efficient transfer learning (PETL) is an emerging research area aimed at inexpensively adapting large-scale pre-trained models to downstream tasks. Recent advances have achieved great success in saving storage costs for various vision tasks by updating or injecting a small number of parameters instead of full fine-tuning. However, we notice that most existing PETL methods still incur non-negligible latency during inference. In this paper, we propose a parameter-efficient and computationally friendly adapter for giant vision models, called RepAdapter. Specifically, we prove that the adaption modules, even with a complex structure, can be seamlessly integrated into most giant vision models via structural re-parameterization. This property makes RepAdapter zero-cost during inference. In addition to its computational efficiency, RepAdapter is more effective and lightweight than existing PETL methods due to its sparse structure and our careful deployment. To validate RepAdapter, we conduct extensive experiments on 27 benchmark datasets covering three vision tasks, i.e., image classification, video classification, and semantic segmentation. Experimental results show that RepAdapter outperforms state-of-the-art PETL methods in both performance and efficiency. For instance, by updating only 0.6% of the parameters, we can improve the performance of ViT from 38.8 to 55.1 on Sun397. Its generalizability is also validated on a range of vision models, namely ViT, CLIP, Swin-Transformer, and ConvNeXt. Our source code is released at https://github.com/luogen1996/RepAdapter.
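The zero-cost inference claim rests on a general fact: a purely linear adapter placed in front of a linear layer can be folded into that layer's weights and bias. The NumPy sketch below illustrates this under stated assumptions and is not the authors' implementation. The adapter form (a residual down/up projection with scale `s`), the dense projection matrices, and all names (`adapter`, `merge_adapter_into_linear`, `Wd`, `Wu`) are hypothetical choices made for illustration; the sparse structure mentioned in the abstract is not reproduced here.

```python
import numpy as np

def adapter(x, Wd, bd, Wu, bu, s):
    # Hypothetical linear adapter: residual down/up projection, no nonlinearity.
    return x + s * ((x @ Wd + bd) @ Wu + bu)

def merge_adapter_into_linear(Wd, bd, Wu, bu, s, W, b):
    # adapter(x) = x @ A + c, with A = I + s * Wd @ Wu and c = s * (bd @ Wu + bu).
    # Hence adapter(x) @ W + b = x @ (A @ W) + (c @ W + b).
    d = W.shape[0]
    A = np.eye(d) + s * (Wd @ Wu)
    c = s * (bd @ Wu + bu)
    return A @ W, c @ W + b

rng = np.random.default_rng(0)
d, r, out = 8, 2, 5  # feature dim, adapter bottleneck dim, output dim (arbitrary)

Wd, bd = rng.normal(size=(d, r)), rng.normal(size=r)    # down projection
Wu, bu = rng.normal(size=(r, d)), rng.normal(size=d)    # up projection
W, b = rng.normal(size=(d, out)), rng.normal(size=out)  # frozen linear layer
s = 0.5                                                 # adapter scale

x = rng.normal(size=(4, d))

# Training-time path: adapter followed by the linear layer.
y_train = adapter(x, Wd, bd, Wu, bu, s) @ W + b

# Inference-time path: one linear layer with merged weights, no adapter call.
W_merged, b_merged = merge_adapter_into_linear(Wd, bd, Wu, bu, s, W, b)
y_infer = x @ W_merged + b_merged

max_abs_diff = float(np.max(np.abs(y_train - y_infer)))
print("max abs difference:", max_abs_diff)
```

The merged layer has the same shape as the original one, so in this sketch the adapter adds no parameters or computation at inference time. The merge only works because the adapter contains no nonlinearity; an adapter with an activation between the two projections could not be folded this way.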
