
Title: EBFT: Effective and Block-Wise Fine-Tuning for Sparse LLMs

Feb 19, 2024

Song Guo, Fan Wu, Lei Zhang, Xiawu Zheng, Shengchuan Zhang, Fei Chao, Yiyu Shi, Rongrong Ji



Abstract: Existing methods for fine-tuning sparse LLMs are often resource-intensive and incur high retraining costs. Many also rely on approximations or heuristic optimization strategies, which may lead to suboptimal solutions. To address these issues, we propose an efficient and fast framework for fine-tuning sparse LLMs based on minimizing reconstruction error. Our approach samples a small calibration dataset and uses backpropagation to iteratively minimize the reconstruction error one block at a time, aiming for optimal solutions. Extensive experiments on various benchmarks consistently demonstrate the superiority of our method over other baselines. For instance, on the Wikitext2 dataset with LlamaV1-7B at 70% sparsity, EBFT achieves a perplexity of 16.88, compared with 75.14 for the state-of-the-art DSnoT. Moreover, at a structured sparsity ratio of 26%, EBFT achieves a perplexity of 16.27, outperforming LoRA (perplexity 16.44). Furthermore, fine-tuning LlamaV1-7B with EBFT takes only approximately 30 minutes, and the entire framework can be executed on a single 16GB GPU. The source code is available at https://github.com/sunggo/EBFT.
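The idea of minimizing reconstruction error on calibration data can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single linear layer standing in for a "block", a fixed random pruning mask, and hand-written gradients in place of backpropagation through a full Transformer block. The function names (`finetune_sparse_block`, `reconstruction_error`) and the hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def reconstruction_error(w, w_dense, x):
    """Mean squared difference between sparse and dense block outputs."""
    diff = x @ w.T - x @ w_dense.T
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def finetune_sparse_block(w_dense, mask, x_calib, lr=0.05, steps=200):
    """Adjust the surviving weights so the pruned layer matches the dense one.

    Assumption: the "block" is one linear layer y = x @ W.T; a real block
    would contain attention and MLP sub-layers and use autograd instead.
    """
    target = x_calib @ w_dense.T        # dense output on the calibration set
    w = w_dense * mask                  # start from the pruned weights
    n = x_calib.shape[0]
    for _ in range(steps):
        err = x_calib @ w.T - target            # shape (n, out_features)
        grad = (2.0 / n) * err.T @ x_calib      # gradient of the mean squared error
        w -= lr * grad * mask                   # pruned positions stay at zero
    return w


# Toy usage with synthetic data (all sizes are illustrative assumptions).
rng = np.random.default_rng(0)
w_dense = rng.standard_normal((4, 8))
mask = (rng.random((4, 8)) > 0.7).astype(w_dense.dtype)  # keep roughly 30% of weights
x_calib = rng.standard_normal((64, 8))                   # small calibration sample

before = reconstruction_error(w_dense * mask, w_dense, x_calib)
w_tuned = finetune_sparse_block(w_dense, mask, x_calib)
after = reconstruction_error(w_tuned, w_dense, x_calib)
print(f"reconstruction error before: {before:.4f}, after: {after:.4f}")
```

In a multi-block setting, one plausible arrangement is to repeat this step for each block in turn, so that only one block's weights and activations need to be held in memory at a time. How block inputs and targets are chosen in EBFT itself is not specified in the abstract.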
