Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yunan Lu

LocalRQA: From Generating Data to Locally Training, Testing, and Deploying Retrieval-Augmented QA Systems

Mar 01, 2024

Xiao Yu, Yunan Lu, Zhou Yu

Abstract:Retrieval-augmented question-answering systems combine retrieval techniques with large language models to provide answers that are more accurate and informative. Many existing toolkits allow users to quickly build such systems using off-the-shelf models, but they fall short in supporting researchers and developers to customize the model training, testing, and deployment process. We propose LocalRQA, an open-source toolkit that features a wide selection of model training algorithms, evaluation methods, and deployment tools curated from the latest research. As a showcase, we build QA systems using online documentation obtained from Databricks and Faire's websites. We find 7B-models trained and deployed using LocalRQA reach a similar performance compared to using OpenAI's text-ada-002 and GPT-4-turbo.

Via

Access Paper or Ask Questions

Mixture of Soft Prompts for Controllable Data Generation

Mar 02, 2023

Derek Chen, Celine Lee, Yunan Lu, Domenic Rosati, Zhou Yu

Figure 1 for Mixture of Soft Prompts for Controllable Data Generation

Figure 2 for Mixture of Soft Prompts for Controllable Data Generation

Figure 3 for Mixture of Soft Prompts for Controllable Data Generation

Figure 4 for Mixture of Soft Prompts for Controllable Data Generation

Abstract:Large language models (LLMs) effectively generate fluent text when the target output follows natural language patterns. However, structured prediction tasks confine the output format to a limited ontology, causing even very large models to struggle since they were never trained with such restrictions in mind. The difficulty of using LLMs for direct prediction is exacerbated in few-shot learning scenarios, which commonly arise due to domain shift and resource limitations. We flip the problem on its head by leveraging the LLM as a tool for data augmentation rather than direct prediction. Our proposed Mixture of Soft Prompts (MSP) serves as a parameter-efficient procedure for generating data in a controlled manner. Denoising mechanisms are further applied to improve the quality of synthesized data. Automatic metrics show our method is capable of producing diverse and natural text, while preserving label semantics. Moreover, MSP achieves state-of-the-art results on three benchmarks when compared against strong baselines. Our method offers an alternate data-centric approach for applying LLMs to complex prediction tasks.

* 17 pages, Preprint

Via

Access Paper or Ask Questions