Veronica Chatrath

FedRAG: A Framework for Fine-Tuning Retrieval-Augmented Generation Systems

Jun 12, 2025

Val Andrei Fajardo, David B. Emerson, Amandeep Singh, Veronica Chatrath, Marcelo Lotif, Ravi Theja, Alex Cheung, Izuki Matsuba

Abstract: Retrieval-augmented generation (RAG) systems have been shown to be effective in addressing many of the drawbacks of relying solely on the parametric memory of large language models. Recent work has demonstrated that RAG systems can be improved via fine-tuning of their retriever and generator models. In this work, we introduce FedRAG, a framework for fine-tuning RAG systems across centralized and federated architectures. FedRAG supports state-of-the-art fine-tuning methods, offering a simple and intuitive interface and a seamless conversion from centralized to federated training tasks. FedRAG is also deeply integrated with the modern RAG ecosystem, filling a critical gap in available tools.

* 9 pages, 4 figures, 2 tables. Accepted for the CODEML Workshop at ICML 2025. Framework code available at https://github.com/VectorInstitute/fed-rag

ViLBias: A Framework for Bias Detection using Linguistic and Visual Cues

Dec 22, 2024

Shaina Raza, Caesar Saleh, Emrul Hasan, Franklin Ogidi, Maximus Powers, Veronica Chatrath, Marcelo Lotif, Roya Javadi, Anam Zahid, Vahid Reza Khazaie

Abstract: The integration of Large Language Models (LLMs) and Vision-Language Models (VLMs) opens new avenues for addressing complex challenges in multimodal content analysis, particularly in biased news detection. This study introduces ViLBias, a framework that leverages state-of-the-art LLMs and VLMs to detect linguistic and visual biases in news content, addressing the limitations of traditional text-only approaches. Our contributions include a novel dataset pairing textual content with accompanying visuals from diverse news sources and a hybrid annotation framework, combining LLM-based annotations with human review to enhance quality while reducing costs and improving scalability. We evaluate the efficacy of LLMs and VLMs in identifying biases, revealing their strengths in detecting subtle framing and text-visual inconsistencies. Empirical analysis demonstrates that incorporating visual cues alongside text enhances bias detection accuracy by 3 to 5%, showcasing the complementary strengths of LLMs in generative reasoning and Small Language Models (SLMs) in classification. This study offers a comprehensive exploration of LLMs and VLMs as tools for detecting multimodal biases in news content, highlighting both their potential and limitations. Our research paves the way for more robust, scalable, and nuanced approaches to media bias detection, contributing to the broader field of natural language processing and multimodal analysis. (The data and code will be made available for research purposes.)

* Under review

Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?

Nov 08, 2024

Veronica Chatrath, Marcelo Lotif, Shaina Raza

Abstract: Political misinformation poses significant challenges to democratic processes, shaping public opinion and trust in media. Manual fact-checking methods face issues of scalability and annotator bias, while machine learning models require large, costly labelled datasets. This study investigates the use of state-of-the-art large language models (LLMs) as reliable annotators for detecting political factuality in news articles. Using open-source LLMs, we create a politically diverse dataset, labelled for bias through LLM-generated annotations. These annotations are validated by human experts and further evaluated by LLM-based judges to assess the accuracy and reliability of the annotations. Our approach offers a scalable and robust alternative to traditional fact-checking, enhancing transparency and public trust in media.

* Accepted at Socially Responsible Language Modelling Research (SoLaR) Workshop at NeurIPS 2024

MBIAS: Mitigating Bias in Large Language Models While Retaining Context

May 22, 2024

Shaina Raza, Ananya Raval, Veronica Chatrath

Abstract: In addressing the critical need for safety in Large Language Models (LLMs), it is crucial to ensure that the outputs are not only safe but also retain their contextual accuracy. Many existing LLMs are either safety fine-tuned with safety demonstrations or rely only on adversarial testing. While able to produce safe outputs, they often risk losing contextual meaning as they mitigate bias and toxicity. In response, we present MBIAS, an LLM framework instruction fine-tuned on a custom dataset specifically designed for safety interventions. MBIAS aims to address the significant issues of bias and toxicity in LLM generations that typically manifest as underrepresentation or negative portrayals across various demographics, including inappropriate linguistic mentions and biased content in social media. We experiment on MBIAS for safety interventions using various configurations, and demonstrate more than a 30% reduction in overall bias and toxicity while successfully retaining key information. Additionally, a demographic analysis on an out-of-distribution test set confirms the robustness of our approach, with reductions in bias and toxicity exceeding 90% across various demographics. The dataset and instruction fine-tuned MBIAS are made available to the research community at https://huggingface.co/newsmediabias/MBIAS.

Closing the Perception-Action Loop for Semantically Safe Navigation in Semi-Static Environments

Apr 22, 2024

Jingxing Qian, Siqi Zhou, Nicholas Jianrui Ren, Veronica Chatrath, Angela P. Schoellig

Abstract: Autonomous robots navigating in changing environments demand adaptive navigation strategies for safe long-term operation. While many modern control paradigms offer theoretical guarantees, they often assume known extrinsic safety constraints, overlooking challenges when deployed in real-world environments where objects can appear, disappear, and shift over time. In this paper, we present a closed-loop perception-action pipeline that bridges this gap. Our system encodes an online-constructed dense map, along with object-level semantic and consistency estimates, into a control barrier function (CBF) to regulate safe regions in the scene. A model predictive controller (MPC) leverages the CBF-based safety constraints to adapt its navigation behavior, which is particularly crucial when potential scene changes occur. We test the system in simulations and real-world experiments to demonstrate the impact of semantic information and scene change handling on robot behavior, validating the practicality of our approach.

* Manuscript accepted to ICRA 2024

FakeWatch: A Framework for Detecting Fake News to Ensure Credible Elections

Mar 14, 2024

Shaina Raza, Tahniat Khan, Drai Paulen-Patterson, Veronica Chatrath, Mizanur Rahman, Oluwanifemi Bamgbose

Abstract: In today's technologically driven world, the rapid spread of fake news, particularly during critical events like elections, poses a growing threat to the integrity of information. To tackle this challenge head-on, we introduce FakeWatch, a comprehensive framework carefully designed to detect fake news. Leveraging a newly curated dataset of North American election-related news articles, we construct robust classification models. Our framework integrates a model hub comprising both traditional machine learning (ML) techniques and cutting-edge Language Models (LMs) to discern fake news effectively. Our overarching objective is to provide the research community with adaptable and precise classification models adept at identifying the ever-evolving landscape of misinformation. Quantitative evaluations of fake news classifiers on our dataset reveal that, while state-of-the-art LMs exhibit a slight edge over traditional ML models, classical models remain competitive due to their balance of accuracy and computational efficiency. Additionally, qualitative analyses shed light on patterns within fake news articles. This research lays the groundwork for future endeavors aimed at combating misinformation, particularly concerning electoral processes. We provide our labeled data and model publicly for use and reproducibility.

* arXiv admin note: substantial text overlap with arXiv:2312.03730

FakeWatch ElectionShield: A Benchmarking Framework to Detect Fake News for Credible US Elections

Nov 27, 2023

Tahniat Khan, Mizanur Rahman, Veronica Chatrath, Oluwanifemi Bamgbose, Shaina Raza

Abstract: In today's technologically driven world, the spread of fake news, particularly during crucial events such as elections, presents an increasing challenge to the integrity of information. To address this challenge, we introduce FakeWatch ElectionShield, an innovative framework carefully designed to detect fake news. We have created a novel dataset of North American election-related news articles through a blend of advanced language models (LMs) and thorough human verification, for precision and relevance. We propose a model hub of LMs for identifying fake news. Our goal is to provide the research community with adaptable and accurate classification models for recognizing the dynamic nature of misinformation. Extensive evaluation of fake news classifiers on our dataset and a benchmark dataset shows that while state-of-the-art LMs slightly outperform the traditional ML models, classical models are still competitive with their balance of accuracy, explainability, and computational efficiency. This research sets the foundation for future studies to address misinformation related to elections.

* Under review

Design-Inclusive Language Models for Responsible Information Access

Oct 20, 2023

Veronica Chatrath, Oluwanifemi Bamgbose, Shaina Raza

Abstract: As the use of large language models (LLMs) increases for everyday tasks, appropriate safeguards must be in place to ensure unbiased and safe output. Recent events highlight ethical concerns around conventionally trained LLMs, leading to overall unsafe user experiences. This motivates the need for responsible LLMs that are trained fairly, transparent to the public, and regularly monitored after deployment. In this work, we introduce the "Responsible Development of Language Models (ReDev)" framework to foster the development of fair, safe, and robust LLMs for all users. We also present a test suite of unique prompt types to assess LLMs on the aforementioned elements, ensuring all generated responses are non-harmful and free from biased content. Outputs from four state-of-the-art LLMs, OPT, GPT-3.5, GPT-4, and LLaMA-2, are evaluated by our test suite, highlighting the importance of considering fairness, safety, and robustness at every stage of the machine learning pipeline, including data curation, training, and post-deployment.

Unlocking Bias Detection: Leveraging Transformer-Based Models for Content Analysis

Oct 14, 2023

Shaina Raza, Oluwanifemi Bamgbose, Veronica Chatrath, Shardul Ghuge, Yan Sidyakin, Abdullah Y Muaad

Abstract: Bias detection in text is imperative due to its role in reinforcing negative stereotypes, disseminating misinformation, and influencing decisions. Current language models often fall short in generalizing beyond their training sets. In response, we introduce the Contextualized Bi-Directional Dual Transformer (CBDT) Classifier. This novel architecture utilizes two synergistic transformer networks: the Context Transformer and the Entity Transformer, aiming for enhanced bias detection. Our dataset preparation follows the FAIR principles, ensuring ethical data usage. Through rigorous testing on various datasets, CBDT showcases its ability to distinguish biased from neutral statements, while also pinpointing exact biased lexemes. Our approach outperforms existing methods, achieving a 2-4% increase over benchmark performances. This opens avenues for adapting the CBDT model across diverse linguistic and cultural landscapes.

* Under review

POV-SLAM: Probabilistic Object-Aware Variational SLAM in Semi-Static Environments

Jul 02, 2023

Jingxing Qian, Veronica Chatrath, James Servos, Aaron Mavrinac, Wolfram Burgard, Steven L. Waslander, Angela P. Schoellig

Abstract: Simultaneous localization and mapping (SLAM) in slowly varying scenes is important for long-term robot task completion. Failing to detect scene changes may lead to inaccurate maps and, ultimately, lost robots. Classical SLAM algorithms assume static scenes, and recent works take dynamics into account, but require scene changes to be observed in consecutive frames. Semi-static scenes, wherein objects appear, disappear, or move slowly over time, are often overlooked, yet are critical for long-term operation. We propose an object-aware, factor-graph SLAM framework that tracks and reconstructs semi-static object-level changes. Our novel variational expectation-maximization strategy is used to optimize factor graphs involving a Gaussian-Uniform bimodal measurement likelihood for potentially-changing objects. We evaluate our approach alongside the state-of-the-art SLAM solutions in simulation and on our novel real-world SLAM dataset captured in a warehouse over four months. Our method improves the robustness of localization in the presence of semi-static changes, providing object-level reasoning about the scene.

* Published in Robotics: Science and Systems (RSS) 2023
