Xing Tian

MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning

Oct 23, 2024

Jingfan Zhang, Yi Zhao, Dan Chen, Xing Tian, Huanran Zheng, Wei Zhu

Abstract: Low-rank adaptation (LoRA) and its mixture-of-experts (MOE) variants are highly effective parameter-efficient fine-tuning (PEFT) methods. However, they introduce significant latency in multi-tenant settings due to the LoRA modules and MOE routers added to multiple linear modules in the Transformer layer. To address this issue, we propose Mixture of Low-Rank Adaptation (MiLoRA), a novel and efficient LoRA variant. MiLoRA differs from previous MOE-style LoRA methods by considering each LoRA module as an expert and employing a prompt-aware routing mechanism. This mechanism calculates expert routing results once before generating the first new token and reuses these results for subsequent tokens, reducing latency. Extensive experiments and analysis on commonsense reasoning tasks, math reasoning tasks, and widely used LLM evaluation benchmarks demonstrate that MiLoRA consistently outperforms strong PEFT baselines with comparable tunable parameter budgets. Additionally, MiLoRA significantly reduces latency in multi-tenant settings compared to previous LoRA-based methods.

* Accepted by EMNLP 2024 Findings. arXiv admin note: substantial text overlap with arXiv:2405.18203
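
The routing idea in the MiLoRA abstract (compute expert gates once from the prompt, then reuse them for every generated token) can be illustrated with a small sketch. This is not the authors' implementation: the mean-pooling step, the linear router, and the names `PromptAwareRouter` and `generate` are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(token_vectors):
    """Average the prompt's token vectors into one vector (assumed pooling)."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


class PromptAwareRouter:
    """Hypothetical linear router: one weight row per LoRA expert."""

    def __init__(self, router_weights):
        self.router_weights = router_weights
        self.calls = 0  # counts how often routing is actually computed

    def route(self, prompt_vectors):
        self.calls += 1
        pooled = mean_pool(prompt_vectors)
        scores = [
            sum(w * x for w, x in zip(row, pooled))
            for row in self.router_weights
        ]
        return softmax(scores)


def generate(router, prompt_vectors, num_new_tokens):
    """Route once before decoding, then reuse the same gates at every step."""
    gates = router.route(prompt_vectors)  # computed a single time
    gates_per_step = [gates for _ in range(num_new_tokens)]  # reused, not recomputed
    return gates, gates_per_step
```

In a per-token router, `route` would run at every decoding step; here `router.calls` stays at 1 no matter how many tokens are generated, which is the source of the latency saving the abstract describes.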

ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models

Mar 24, 2024

Zequan Liu, Jiawen Lyn, Wei Zhu, Xing Tian, Yvette Graham

Abstract: Parameter-efficient fine-tuning (PEFT) is widely studied for its effectiveness and efficiency in the era of large language models. Low-rank adaptation (LoRA) has demonstrated commendable performance as a popular and representative method. However, it is implemented with a fixed intrinsic rank that might not be the ideal setting for the downstream tasks. Recognizing the need for more flexible downstream task adaptation, we extend the methodology of LoRA to an innovative approach we call allocating low-rank adaptation (ALoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process. First, we propose a novel method, AB-LoRA, that can effectively estimate the importance score of each LoRA rank. Second, guided by AB-LoRA, we gradually prune abundant and negatively impacting LoRA ranks and allocate the pruned LoRA budgets to important Transformer modules needing higher ranks. We have conducted experiments on various tasks, and the experimental results demonstrate that our ALoRA method can outperform the recent baselines with comparable tunable parameters.

* Accepted by NAACL-2024
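
The prune-and-reallocate step in the ALoRA abstract can be sketched as follows. The importance scores are taken as given inputs here; how AB-LoRA computes them is not reproduced. The rule of handing freed ranks to modules that lost none, the round-robin order, and the name `reallocate_ranks` are assumptions for illustration, not the paper's procedure.

```python
def reallocate_ranks(importance, num_to_prune):
    """Prune the lowest-scoring LoRA ranks and hand the budget to other modules.

    importance: dict mapping a module name to a list of scores,
                one score per LoRA rank the module currently holds.
    Returns a dict mapping each module name to its new rank count.
    """
    # Flatten to (score, module, rank_index) and sort ascending by score.
    scored = [
        (score, module, index)
        for module, scores in importance.items()
        for index, score in enumerate(scores)
    ]
    scored.sort(key=lambda item: item[0])

    ranks = {module: len(scores) for module, scores in importance.items()}
    pruned_modules = set()
    for _, module, _ in scored[:num_to_prune]:
        ranks[module] -= 1
        pruned_modules.add(module)

    # Assumed heuristic: modules that lost no rank receive the freed budget,
    # distributed round-robin in name order. If every module was pruned,
    # the freed budget is left unallocated in this sketch.
    receivers = sorted(m for m in importance if m not in pruned_modules)
    if receivers:
        for i in range(num_to_prune):
            ranks[receivers[i % len(receivers)]] += 1
    return ranks


scores = {
    "q_proj": [0.9, 0.8, 0.7, 0.6],
    "v_proj": [0.5, -0.2, 0.05, 0.4],
    "ffn": [0.3, 0.9, 0.01, 0.8],
}
new_ranks = reallocate_ranks(scores, num_to_prune=2)
```

With these made-up scores, one rank is pruned from `v_proj` and one from `ffn`, and both freed ranks go to `q_proj`, so the total number of ranks (the tunable-parameter budget) is unchanged.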

Text2MDT: Extracting Medical Decision Trees from Medical Texts

Jan 04, 2024

Wei Zhu, Wenfeng Li, Xing Tian, Pengfei Wang, Xiaoling Wang, Jin Chen, Yuanbin Wu, Yuan Ni, Guotong Xie

Abstract: Knowledge of the medical decision process, which can be modeled as medical decision trees (MDTs), is critical to building clinical decision support systems. However, current MDT construction methods rely heavily on time-consuming and laborious manual annotation. In this work, we propose a novel task, Text2MDT, to explore the automatic extraction of MDTs from medical texts such as medical guidelines and textbooks. We normalize the form of the MDT and create an annotated Text-to-MDT dataset in Chinese with the participation of medical experts. We investigate two methods for the Text2MDT task: (a) an end-to-end framework that relies only on instruction tuning of a GPT-style large language model (LLM) to generate all the node information and tree structures; (b) a pipeline framework that decomposes the Text2MDT task into three subtasks. Experiments on our Text2MDT dataset demonstrate that: (a) the end-to-end method based on LLMs (7B parameters or larger) shows promising results and outperforms the pipeline method; (b) the chain-of-thought (CoT) prompting method (Wei et al., 2022) can improve the performance of the fine-tuned LLMs on the Text2MDT test set; (c) the lightweight pipeline method based on encoder-based pretrained models performs comparably with LLMs while having a model complexity two orders of magnitude smaller. Our Text2MDT dataset is open-sourced at https://tianchi.aliyun.com/dataset/95414, and the source code is open-sourced at https://github.com/michael-wzhu/text2dt.

Probabilistic Classification Vector Machine for Multi-Class Classification

Jun 29, 2020

Shengfei Lyu, Xing Tian, Yang Li, Bingbing Jiang, Huanhuan Chen

Abstract: The probabilistic classification vector machine (PCVM) synthesizes the advantages of both the support vector machine and the relevance vector machine, delivering a sparse Bayesian solution to classification problems. However, the PCVM is currently only applicable to binary cases. Extending the PCVM to multi-class cases via heuristic voting strategies such as one-vs-rest or one-vs-one often results in a dilemma where classifiers make contradictory predictions, and those strategies might lose the benefits of probabilistic outputs. To overcome this problem, we extend the PCVM and propose a multi-class probabilistic classification vector machine (mPCVM). Two learning algorithms, i.e., one top-down algorithm and one bottom-up algorithm, have been implemented in the mPCVM. The top-down algorithm obtains the maximum a posteriori (MAP) point estimates of the parameters based on an expectation-maximization algorithm, and the bottom-up algorithm is an incremental paradigm that maximizes the marginal likelihood. The superior performance of the mPCVMs, especially when the investigated problem has a large number of classes, is demonstrated through extensive evaluation on synthetic and benchmark data sets.
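
The one-vs-rest dilemma mentioned in the abstract above can be made concrete with a toy example. The probabilities are made up and the helper `one_vs_rest_decision` is hypothetical; the sketch only shows why independently trained binary classifiers can contradict each other, and does not implement the PCVM or mPCVM.

```python
def one_vs_rest_decision(binary_probs, threshold=0.5):
    """Collect the classes whose binary classifier says 'this one, not the rest'.

    binary_probs: dict mapping a class label to the positive-class probability
                  produced by that label's own binary classifier.
    Returns (claimed_labels, status).
    """
    claimed = sorted(
        label for label, prob in binary_probs.items() if prob >= threshold
    )
    if len(claimed) == 1:
        status = "consistent"
    elif claimed:
        status = "conflict"  # several classifiers each claim the sample
    else:
        status = "no_class"  # every classifier rejects the sample
    return claimed, status


# Made-up outputs of three independent binary classifiers for one sample.
outputs = {"cat": 0.7, "dog": 0.6, "bird": 0.2}
claimed, status = one_vs_rest_decision(outputs)
total_mass = sum(outputs.values())
```

Here two classifiers claim the sample at once, and the per-class probabilities sum to about 1.5 rather than 1, so they no longer form a distribution over classes. A model that is multi-class by construction avoids both problems.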
