Abstract: This report investigates enhancing semantic caching effectiveness by employing specialized, fine-tuned embedding models. Semantic caching relies on embedding similarity rather than exact key matching, presenting unique challenges in balancing precision, query latency, and computational efficiency. We propose leveraging smaller, domain-specific embedding models, fine-tuned with targeted real-world and synthetically generated datasets. Our empirical evaluations demonstrate that compact embedding models fine-tuned for just one epoch on specialized datasets significantly surpass both state-of-the-art open-source and proprietary alternatives in precision and recall. Moreover, we introduce a novel synthetic data generation pipeline for the semantic cache that mitigates the challenge of limited domain-specific annotated data, further boosting embedding performance. Our approach effectively balances computational overhead and accuracy, establishing a viable and efficient strategy for practical semantic caching implementations.
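For concreteness, the similarity-based lookup that distinguishes a semantic cache from an exact-key cache can be sketched as below. This is a minimal illustration, not the authors' implementation: the `embed` function is a stand-in for whatever compact fine-tuned encoder is used, and the similarity threshold is an assumed tuning knob that trades precision against recall.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for a compact, fine-tuned embedding model (hypothetical).
    Returns a deterministic unit-norm vector for demonstration purposes."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

class SemanticCache:
    """Minimal semantic cache: hits on embedding similarity, not exact key matches."""

    def __init__(self, threshold: float = 0.85):
        self.threshold = threshold  # similarity cutoff; raising it favors precision over recall
        self.entries: list[tuple[np.ndarray, str]] = []

    def get(self, query: str):
        q = embed(query)
        best_sim, best_val = -1.0, None
        for vec, val in self.entries:
            sim = float(q @ vec)  # cosine similarity (both vectors are unit-norm)
            if sim > best_sim:
                best_sim, best_val = sim, val
        return best_val if best_sim >= self.threshold else None  # None signals a cache miss

    def put(self, query: str, response: str):
        self.entries.append((embed(query), response))
```

In practice the linear scan would be replaced by an approximate nearest-neighbor index, and the quality of `embed` is exactly what the fine-tuning described above is meant to improve.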
Abstract: The output of a Large Language Model (LLM) is a function of the model's internal parameters and the input provided in the context window. The hypothesis presented here is that, under a greedy sampling strategy, the variance in the LLM's output is a function of the conceptual certainty embedded in the model's parametric knowledge as well as the lexical variance in the input. Fine-tuning the model reduces the sensitivity of its output to lexical variations in the input. This is then applied to a classification problem, and a probabilistic method is proposed for estimating the certainty of the predicted classes.
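One plausible reading of such a probabilistic certainty estimate is sketched below, under the assumption that certainty is measured as the label distribution induced by greedy decoding over lexical paraphrases of the same input. The function names and the toy classifier are illustrative, not from the paper.

```python
from collections import Counter

def classify_greedy(prompt: str) -> str:
    """Stand-in for an LLM classifier called with greedy decoding (temperature 0).
    A real implementation would send `prompt` to the model and parse the label."""
    return "positive" if "good" in prompt.lower() else "negative"

def class_certainties(paraphrases: list[str]) -> dict[str, float]:
    """Estimate per-class certainty as the fraction of lexical paraphrases of the
    same underlying input that greedy decoding maps to each label. High agreement
    across paraphrases indicates low sensitivity to lexical input variation."""
    counts = Counter(classify_greedy(p) for p in paraphrases)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Usage: three paraphrases of one review; a fine-tuned model should agree more often.
print(class_certainties([
    "This product is good.",
    "I found this product to be good overall.",
    "Honestly, a pretty solid purchase.",
]))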
Abstract: In this work, we seek to fine-tune a weakly supervised, expert-guided Deep Neural Network (DNN) for the purpose of determining political affiliation. In this context, stance detection is used to determine political affiliation or ideology, framed as relative proximities between entities in a low-dimensional space. An attention-based mechanism is used to provide model interpretability. A Deep Neural Network for Natural Language Understanding (NLU) using static and contextual embeddings is trained and evaluated. Various techniques for visualizing the projections generated by the network are evaluated for visualization efficiency. We give an overview of the pipeline from data ingestion and processing to visualization generation, and present a web-based framework created to facilitate this interaction and exploration. Preliminary results of this study are summarized and future work is outlined.
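To make the "relative proximities in a low-dimensional space" framing concrete, the sketch below projects entity embeddings to two dimensions and reads affinity off pairwise distances. This is a hedged illustration only: the entity names are hypothetical, and PCA is one assumed choice of projection, not necessarily the one used in the paper.

```python
import numpy as np

def project_and_proximities(embeddings: dict[str, np.ndarray], dim: int = 2):
    """Project entity embeddings (e.g., politicians, parties) to `dim` dimensions
    via PCA and return both the coordinates and pairwise distances, where smaller
    distance is read as closer political affinity."""
    names = list(embeddings)
    X = np.stack([embeddings[n] for n in names])
    X = X - X.mean(axis=0)  # center before PCA
    # PCA via SVD: the top `dim` right-singular vectors span the projection plane.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    coords = X @ vt[:dim].T
    dists = {
        (a, b): float(np.linalg.norm(coords[i] - coords[j]))
        for i, a in enumerate(names)
        for j, b in enumerate(names)
        if i < j
    }
    return dict(zip(names, coords)), dists

# Usage with random stand-in embeddings for three hypothetical entities.
rng = np.random.default_rng(0)
ents = {name: rng.normal(size=64) for name in ("entity_a", "entity_b", "entity_c")}
coords, dists = project_and_proximities(ents)
print(dists)
```

A visualization layer such as the web-based framework described above would then render `coords` directly as a 2D scatter plot for interactive exploration.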