Abstract: Vision-language models (VLMs) are pretrained foundation models designed for large-scale image-text alignment. For downstream few-shot classification tasks, parameter-efficient fine-tuning (PEFT) of VLMs has gained much popularity in the computer vision community. PEFT methods such as prompt tuning and linear adapters have been studied for fine-tuning VLMs, whereas the low-rank adaptation (LoRA) algorithm has rarely been considered for few-shot fine-tuning of VLMs. The main obstacle to using LoRA for few-shot fine-tuning is catastrophic forgetting: the vision-language alignment knowledge is important for generality in few-shot learning, yet low-rank adaptation interferes with the most informative directions of the pretrained weight matrix. We propose the complementary subspace low-rank adaptation (Comp-LoRA) method to mitigate catastrophic forgetting in few-shot VLM fine-tuning. Specifically, we optimize the low-rank matrix in the complementary subspace, thus preserving the general vision-language alignment ability of the VLM while learning the novel few-shot information. We conduct comparison experiments between the proposed Comp-LoRA method and other PEFT methods on fine-tuning VLMs for few-shot classification, and we also show that our method suppresses catastrophic forgetting compared with directly applying LoRA to VLMs. The results show that the proposed method surpasses the baseline by about +1.0\% Top-1 accuracy and preserves the VLM's zero-shot performance over the baseline by about +1.3\% Top-1 accuracy.
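The abstract does not give the exact formulation of Comp-LoRA, but the core idea it describes (restricting the low-rank update to a subspace complementary to the most informative directions of the pretrained weight) can be sketched as follows. This is a minimal illustrative sketch, assuming the complementary subspace is taken orthogonal to the top-k left singular vectors of the frozen weight; the names `CompLoRALinear`, `complementary_projector`, `rank`, and `k_principal` are hypothetical, not the paper's API.

```python
import torch

def complementary_projector(W: torch.Tensor, k: int) -> torch.Tensor:
    # Top-k left singular vectors span the "most informative" directions
    # of the pretrained weight W (shape d_out x d_in).
    U, _, _ = torch.linalg.svd(W, full_matrices=False)
    U_k = U[:, :k]                                  # principal subspace basis
    I = torch.eye(W.shape[0], device=W.device, dtype=W.dtype)
    return I - U_k @ U_k.T                          # projector onto the complement

class CompLoRALinear(torch.nn.Module):
    """Hypothetical Comp-LoRA layer: the low-rank update B @ A is
    projected into the complement of the pretrained principal subspace,
    so the update cannot disturb the dominant alignment directions."""
    def __init__(self, W: torch.Tensor, rank: int = 4, k_principal: int = 32):
        super().__init__()
        self.register_buffer("W", W)                # frozen pretrained weight
        self.register_buffer("P", complementary_projector(W, k_principal))
        d_out, d_in = W.shape
        # Standard LoRA initialization: A small random, B zero,
        # so the initial update is zero and training starts from W.
        self.A = torch.nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.P @ (self.B @ self.A)          # update restricted to the complement
        return x @ (self.W + delta).T
```

Under this reading, vanilla LoRA corresponds to dropping the projector `P`, which lets the update overwrite the principal directions and cause the forgetting the abstract describes.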
Abstract: The Plug-and-Play (PnP) algorithm is popular for solving inverse imaging problems. However, it lacks a theoretical analysis of its convergence with more advanced plug-in denoisers. We demonstrate that the discrete PnP iteration can be described by a continuous stochastic differential equation (SDE); this transformation can also be achieved through a Markov-process formulation of PnP. From this higher standpoint, we view PnP algorithms through the lens of stochastic differential equations and give a unified framework for the convergence of PnP based on the solvability conditions of the corresponding SDE. We reveal that a much weaker condition, a bounded denoiser with a Lipschitz-continuous measurement function, is sufficient for the convergence guarantee, instead of the previous Lipschitz-continuous denoiser condition.
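For reference, the discrete-to-continuous correspondence the abstract alludes to can be written schematically; the specific drift and diffusion terms are the paper's contribution, so the forms below are only a generic illustration, with the step size \gamma, data-fidelity term f, and denoiser D_\sigma assumed rather than taken from the paper.

```latex
% Generic PnP iteration (PnP-ISTA form): a gradient step on the data
% fidelity f followed by a plug-in denoiser D_\sigma.
x_{k+1} = D_\sigma\!\left(x_k - \gamma \nabla f(x_k)\right)

% Schematic continuous-time counterpart: an SDE whose Euler-type
% discretization recovers the iteration above, with drift b and
% diffusion coefficient s determined by f and the denoiser.
\mathrm{d}X_t = b(X_t)\,\mathrm{d}t + s(X_t)\,\mathrm{d}W_t
```

In this framing, convergence of the iteration is tied to the solvability (existence and uniqueness of solutions) of the SDE, which is why conditions on the denoiser can be relaxed from Lipschitz continuity to boundedness.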