Patrice Béchard

Multi-task retriever fine-tuning for domain-specific and efficient RAG

Jan 08, 2025

Patrice Béchard, Orlando Marquez Ayala

Abstract: Retrieval-Augmented Generation (RAG) has become ubiquitous when deploying Large Language Models (LLMs), as it can address typical limitations such as generating hallucinated or outdated information. However, when building real-world RAG applications, practical issues arise. First, the retrieved information is generally domain-specific. Since it is computationally expensive to fine-tune LLMs, it is more feasible to fine-tune the retriever to improve the quality of the data included in the LLM input. Second, as more applications are deployed in the same real-world system, one cannot afford to deploy separate retrievers. Moreover, these RAG applications normally retrieve different kinds of data. Our solution is to instruction fine-tune a small retriever encoder on a variety of domain-specific tasks, allowing us to deploy one encoder that can serve many use cases, thereby achieving low cost, scalability, and speed. We show how this encoder generalizes to out-of-domain settings as well as to an unseen retrieval task on real-world enterprise use cases.

* 9 pages, 2 figures. Submitted to NAACL 2025 Industry Track

Generating a Low-code Complete Workflow via Task Decomposition and RAG

Nov 29, 2024

Orlando Marquez Ayala, Patrice Béchard

Abstract: AI technologies are moving rapidly from research to production. With the popularity of Foundation Models (FMs) that generate text, images, and video, AI-based systems are growing in complexity. Compared to traditional AI-based software, systems employing FMs, or GenAI-based systems, are more difficult to design due to their scale and versatility. This makes it necessary to document best practices, known as design patterns in software engineering, that can be used across GenAI applications. Our first contribution is to formalize two techniques, Task Decomposition and Retrieval-Augmented Generation (RAG), as design patterns for GenAI-based systems. We discuss their trade-offs in terms of software quality attributes and comment on alternative approaches. We recommend that AI practitioners consider these techniques not only from a scientific perspective but also from the standpoint of desired engineering properties such as flexibility, maintainability, safety, and security. As a second contribution, we describe our industry experience applying Task Decomposition and RAG to build a complex real-world GenAI application for enterprise users: Workflow Generation. The task of generating workflows entails taking a user requirement as input and generating a specific plan using data from the system environment. As these two patterns affect the entire AI development cycle, we explain how they impacted the dataset creation, model training, model evaluation, and deployment phases.

* Under review; 12 pages, 8 figures

Reducing hallucination in structured outputs via Retrieval-Augmented Generation

Apr 12, 2024

Patrice Béchard, Orlando Marquez Ayala

Abstract: A common and fundamental limitation of Generative AI (GenAI) is its propensity to hallucinate. While large language models (LLMs) have taken the world by storm, real-world GenAI systems may face challenges in user adoption unless hallucinations are eliminated or at least reduced. In the process of deploying an enterprise application that produces workflows based on natural language requirements, we devised a system leveraging Retrieval-Augmented Generation (RAG) to greatly improve the quality of the structured output that represents such workflows. Thanks to our implementation of RAG, our proposed system significantly reduces hallucinations in the output and improves the generalization of our LLM in out-of-domain settings. In addition, we show that using a small, well-trained retriever encoder can reduce the size of the accompanying LLM, thereby making deployments of LLM-based systems less resource-intensive.

* To be presented at NAACL 2024. 11 pages and 4 figures
