Title: TSpec-LLM: An Open-source Dataset for LLM Understanding of 3GPP Specifications

Jun 03, 2024

Rasoul Nikbakht, Mohamed Benzaghta, Giovanni Geraci

Abstract: Understanding telecom standards involves sorting through numerous technical documents, such as those produced by the 3rd Generation Partnership Project (3GPP), which is time-consuming and labor-intensive. While large language models (LLMs) can assist with the extensive 3GPP knowledge base, an inclusive dataset is crucial for their effective pre-training and fine-tuning. In this paper, we introduce TSpec-LLM, an open-source comprehensive dataset covering all 3GPP documents from Release 8 to Release 19 (1999–2023). To evaluate its efficacy, we first select a representative sample of 3GPP documents, create corresponding technical questions, and assess the baseline performance of various LLMs. We then incorporate a retrieval-augmented generation (RAG) framework to enhance LLM capabilities by retrieving relevant context from the TSpec-LLM dataset. Our evaluation shows that using a naive-RAG framework on TSpec-LLM improves the accuracy of GPT-3.5, Gemini 1.0 Pro, and GPT-4 from 44%, 46%, and 51% to 71%, 75%, and 72%, respectively.
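
As a rough illustration of the retrieval step the abstract describes, the sketch below ranks text chunks against a question and assembles a prompt from the best matches. It is not the authors' pipeline: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the prompt layout, and the value of `k` are all hypothetical, and a simple bag-of-words cosine similarity stands in for whatever embedding model a real RAG system would use. The chunks a caller passes in would come from the dataset; none are supplied here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question.

    Assumption: bag-of-words similarity replaces a learned embedding.
    """
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Place the retrieved chunks ahead of the question (hypothetical layout)."""
    context = "\n\n".join(retrieve(question, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The string returned by `build_prompt` would then be sent to the LLM in place of the bare question, which is the only difference between the baseline and the retrieval-augmented setting in this sketch.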
