Yuki Markus Asano

VeRA: Vector-based Random Matrix Adaptation

Oct 17, 2023

Dawid Jan Kopiczko, Tijmen Blankevoort, Yuki Markus Asano

Abstract: Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when finetuning large language models, but still faces acute storage challenges when scaling to even larger models or deploying numerous per-user or per-task adapted models. In this work, we present Vector-based Random Matrix Adaptation (VeRA), which reduces the number of trainable parameters by 10x compared to LoRA, yet maintains the same performance. It achieves this by using a single pair of low-rank matrices shared across all layers and learning small scaling vectors instead. We demonstrate its effectiveness on the GLUE and E2E benchmarks, and show its application in instruction-following with just 1.4M parameters using the Llama2 7B model.
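
The idea of one shared pair of low-rank matrices plus small per-layer scaling vectors can be illustrated with a minimal NumPy sketch. Everything beyond what the abstract states is an assumption made for illustration: the names `make_shared_matrices` and `vera_delta`, the Gaussian initialization and its scaling, the starting values of the vectors `d` and `b`, and the toy dimensions. It is not the authors' implementation.

```python
import numpy as np

def make_shared_matrices(in_dim, out_dim, rank, seed=0):
    # One random low-rank pair, reused by every layer and never trained.
    # The Gaussian init and its scaling are assumptions of this sketch.
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((rank, in_dim)) / np.sqrt(in_dim)
    B = rng.standard_normal((out_dim, rank)) / np.sqrt(rank)
    return A, B

def vera_delta(A, B, d, b):
    # Weight update diag(b) @ B @ diag(d) @ A, written with broadcasting
    # so the diagonal matrices are never materialized.
    return (b[:, None] * B) @ (d[:, None] * A)

in_dim = out_dim = 64
rank = 8
num_layers = 4

A, B = make_shared_matrices(in_dim, out_dim, rank)

# Per-layer trainable state: only the two scaling vectors.
# Starting values (d = 0.1, b = 0) are illustrative assumptions.
layers = [{"d": np.full(rank, 0.1), "b": np.zeros(out_dim)}
          for _ in range(num_layers)]

x = np.ones(in_dim)
W0 = np.eye(out_dim, in_dim)  # stand-in for a frozen pretrained weight
h = (W0 + vera_delta(A, B, layers[0]["d"], layers[0]["b"])) @ x

# Trainable parameter counts for this toy configuration.
vera_trainable = num_layers * (rank + out_dim)
lora_trainable = num_layers * rank * (in_dim + out_dim)
print(vera_trainable, lora_trainable)
```

In this toy setup the per-layer cost is `rank + out_dim` values rather than `rank * (in_dim + out_dim)`, which is where the storage saving comes from; the exact ratio depends on the model dimensions and ranks chosen.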

Self-labelling via simultaneous clustering and representation learning

Nov 26, 2019

Yuki Markus Asano, Christian Rupprecht, Andrea Vedaldi

Abstract: Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks. However, doing so naively leads to ill-posed learning problems with degenerate solutions. In this paper, we propose a novel and principled learning formulation that addresses these issues. The method is obtained by maximizing the information between labels and input data indices. We show that this criterion extends standard cross-entropy minimization to an optimal transport problem, which we solve efficiently for millions of input images and thousands of labels using a fast variant of the Sinkhorn-Knopp algorithm. The resulting method is able to self-label visual data so as to train highly competitive image representations without manual labels. Our method achieves state-of-the-art representation learning performance for AlexNet and ResNet-50 on SVHN, CIFAR-10, CIFAR-100 and ImageNet.
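
To make the optimal transport step more concrete, here is a minimal sketch of the plain Sinkhorn-Knopp iteration applied to a matrix of predicted label probabilities. It is not the fast variant the abstract refers to. The function name `sinkhorn_knopp`, the sharpening exponent `lam`, the iteration count, and the assumption that labels should be spread evenly across samples (uniform row and column marginals) are all choices made for this illustration.

```python
import numpy as np

def sinkhorn_knopp(probs, lam=1.0, n_iters=500):
    # probs: (n_samples, n_labels) matrix of strictly positive probabilities.
    # Returns a soft assignment Q whose rows each sum to 1/n and whose
    # columns each sum (approximately) to 1/k, i.e. labels are balanced.
    n, k = probs.shape
    Q = probs ** lam          # lam is a hypothetical sharpening parameter
    Q /= Q.sum()
    for _ in range(n_iters):
        Q *= (1.0 / k) / Q.sum(axis=0, keepdims=True)  # rescale columns
        Q *= (1.0 / n) / Q.sum(axis=1, keepdims=True)  # rescale rows
    return Q

# Toy data: random softmax outputs for 12 samples and 3 labels.
rng = np.random.default_rng(0)
logits = rng.standard_normal((12, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

Q = sinkhorn_knopp(probs)
labels = Q.argmax(axis=1)  # hard pseudo-labels read off the soft assignment
print(labels)
```

The balancing constraint is what rules out the degenerate solution in which every sample receives the same label; the resulting pseudo-labels could then serve as targets for an ordinary cross-entropy training step.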
