Shawn Lan

Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning

Feb 26, 2024

Mingtian Zhang, Shawn Lan, Peter Hayes, David Barber

Abstract: Retrieval Augmented Generation (RAG) has emerged as an effective solution for mitigating hallucinations in Large Language Models (LLMs). The retrieval stage in RAG typically involves a pre-trained embedding model, which converts queries and passages into vectors to capture their semantics. However, a standard pre-trained embedding model may perform sub-optimally on domain-specific knowledge, necessitating fine-tuning. This paper addresses scenarios where the embeddings are only available from a black-box model. We introduce Model augmented fine-tuning (Mafin), a novel approach for fine-tuning a black-box embedding model by augmenting it with a trainable embedding model. Our results demonstrate that Mafin significantly enhances the performance of the black-box embeddings while requiring only the training of a small augmented model. We validate the effectiveness of our method on both labeled and unlabeled datasets, illustrating its broad applicability and efficiency.
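
The abstract does not say how the black-box and trainable embeddings are combined, so the following is only a minimal sketch of one plausible scheme: normalize each embedding, concatenate them, and rescale so the result keeps unit norm. The function names, the toy vectors, and the concatenation rule itself are assumptions made for illustration, not the authors' confirmed method.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def augmented_embedding(black_box_vec, trainable_vec):
    """Assumed combination rule: concatenate the two normalized embeddings
    and scale by 1/sqrt(2) so the combined vector also has unit norm."""
    frozen = l2_normalize(black_box_vec)    # from the black-box API, never updated
    learned = l2_normalize(trainable_vec)   # from the small trainable model
    scale = 1.0 / math.sqrt(2.0)
    return [scale * x for x in frozen + learned]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Hypothetical stand-ins for real model outputs.
black_box_query, black_box_passage = [0.2, 0.9, 0.1], [0.3, 0.8, 0.2]
trainable_query, trainable_passage = [0.5, -0.1], [0.4, 0.0]

query = augmented_embedding(black_box_query, trainable_query)
passage = augmented_embedding(black_box_passage, trainable_passage)

# Under this rule the retrieval score is the average of the two cosine
# similarities, so only the trainable half needs gradient updates.
score = dot(query, passage)
```

With this design choice, the black-box vectors can be computed once and cached, and fine-tuning adjusts only the small model's contribution to the score.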
