Title: Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches

Aug 20, 2024

Yanjie Dong, Xiaoyi Fan, Fangxin Wang, Chengming Li, Victor C. M. Leung, Xiping Hu

Abstract: Since the invention of GPT2-1.5B in 2019, large language models (LLMs) have transitioned from specialized models to versatile foundation models. LLMs exhibit impressive zero-shot ability; however, they require fine-tuning on local datasets and significant resources for deployment. Traditional fine-tuning techniques with first-order optimizers require substantial GPU memory that exceeds the capability of mainstream hardware, which motivates the investigation of memory-efficient methods. Model compression techniques can reduce energy consumption, operational costs, and environmental impact, thereby supporting sustainable advances in artificial intelligence. Additionally, large-scale foundation models have expanded to generate images, audio, video, and multi-modal content, further emphasizing the need for efficient deployment. We therefore present a comprehensive overview of the prevalent memory-efficient fine-tuning methods over the network edge. We also review the state-of-the-art literature on model compression to provide a vision for deploying LLMs over the network edge.
