Rahul Mishra

HalluCounter: Reference-free LLM Hallucination Detection in the Wild!

Mar 06, 2025

Ashok Urlana, Gopichand Kanumolu, Charaka Vinayak Kumar, Bala Mallikarjunarao Garlapati, Rahul Mishra

Abstract: Response consistency-based, reference-free hallucination detection (RFHD) methods do not depend on internal model states, such as generation probabilities or gradients, which grey-box models typically rely on but which are inaccessible in closed-source LLMs. However, their inability to capture query-response alignment patterns often results in lower detection accuracy. Additionally, the lack of large-scale benchmark datasets spanning diverse domains remains a challenge, as most existing datasets are limited in size and scope. To this end, we propose HalluCounter, a novel reference-free hallucination detection method that utilizes both response-response and query-response consistency and alignment patterns. This enables the training of a classifier that detects hallucinations and provides a confidence score and an optimal response for user queries. Furthermore, we introduce HalluCounterEval, a benchmark dataset comprising both synthetically generated and human-curated samples across multiple domains. Our method outperforms state-of-the-art approaches by a significant margin, achieving over 90% average confidence in hallucination detection across datasets.

* 30 pages, 4 figures

One Arrow, Many Targets: Probing LLMs for Multi-Attribute Controllable Text Summarization

Nov 02, 2024

Tathagato Roy, Rahul Mishra

Abstract: Text summarization is a well-established task within the natural language processing (NLP) community. However, controllable summarization tailored to user requirements has gained traction only recently. While several efforts explore controllability in text summarization, the investigation of Multi-Attribute Controllable Summarization (MACS) remains limited. This work addresses this gap by examining the MACS task through the lens of large language models (LLMs), using various learning paradigms, particularly low-rank adapters. We experiment with different popular adapter fine-tuning strategies to assess the effectiveness of the resulting models in retaining cues and patterns associated with multiple controllable attributes. Additionally, we propose and evaluate a novel hierarchical adapter fusion technique to integrate learnings from two distinct controllable attributes. Subsequently, we present our findings, discuss the challenges encountered, and suggest potential avenues for advancing the MACS task.

KTCR: Improving Implicit Hate Detection with Knowledge Transfer driven Concept Refinement

Oct 20, 2024

Samarth Garg, Vivek Hruday Kavuri, Gargi Shroff, Rahul Mishra

Abstract: The constant shifts in social and political contexts, driven by emerging social movements and political events, lead to new forms of hate content and previously unrecognized hate patterns that machine learning models may not have captured. Some recent literature proposes data augmentation-based techniques to enrich existing hate datasets by incorporating samples that reveal new implicit hate patterns. This approach aims to improve the model's performance on out-of-domain implicit hate instances. It is observed that adding further samples for augmentation decreases the model's performance. In this work, we propose a Knowledge Transfer-driven Concept Refinement method that distills and refines the concepts related to implicit hate samples through novel prototype alignment and concept losses, alongside data augmentation based on concept activation vectors. Experiments with several publicly available datasets show that incorporating additional implicit samples reflecting new hate patterns through concept refinement enhances the model's performance, surpassing baseline results while maintaining cross-dataset generalization capabilities. DISCLAIMER: This paper contains explicit statements that are potentially offensive.

* 11 pages, 4 figures, 2 algorithms, 5 tables

SceneGraMMi: Scene Graph-boosted Hybrid-fusion for Multi-Modal Misinformation Veracity Prediction

Oct 20, 2024

Swarang Joshi, Siddharth Mavani, Joel Alex, Arnav Negi, Rahul Mishra, Ponnurangam Kumaraguru

Abstract: Misinformation undermines individual knowledge and affects broader societal narratives. Despite growing interest in the research community in multi-modal misinformation detection, existing methods exhibit limitations in capturing semantic cues, key regions, and cross-modal similarities within multi-modal datasets. We propose SceneGraMMi, a Scene Graph-boosted Hybrid-fusion approach for Multi-modal Misinformation veracity prediction, which integrates scene graphs across different modalities to improve detection performance. Experimental results across four benchmark datasets show that SceneGraMMi consistently outperforms state-of-the-art methods. In a comprehensive ablation study, we highlight the contribution of each component, while Shapley values are employed to examine the explainability of the model's decision-making process.

DiscoGraMS: Enhancing Movie Screen-Play Summarization using Movie Character-Aware Discourse Graph

Oct 18, 2024

Maitreya Prafulla Chitale, Uday Bindal, Rajakrishnan Rajkumar, Rahul Mishra

Abstract: Summarizing movie screenplays presents a unique set of challenges compared to standard document summarization. Screenplays are not only lengthy, but also feature a complex interplay of characters, dialogues, and scenes, with numerous direct and subtle relationships and contextual nuances that are difficult for machine learning models to accurately capture and comprehend. Recent attempts at screenplay summarization focus on fine-tuning transformer-based pre-trained models, but these models often fall short in capturing long-term dependencies and latent relationships, and frequently encounter the "lost in the middle" issue. To address these challenges, we introduce DiscoGraMS, a novel resource that represents movie scripts as a movie character-aware discourse graph (CaD Graph). This approach is well-suited for various downstream tasks, such as summarization, question-answering, and salience detection. The model aims to preserve all salient information, offering a more comprehensive and faithful representation of the screenplay's content. We further explore a baseline method that combines the CaD Graph with the corresponding movie script through a late fusion of graph and text modalities, and we present promising preliminary results.

Utilizing Transfer Learning and pre-trained Models for Effective Forest Fire Detection: A Case Study of Uttarakhand

Oct 09, 2024

Hari Prabhat Gupta, Rahul Mishra

Abstract: Forest fires pose a significant threat to the environment, human life, and property. Early detection and response are crucial to mitigating the impact of these disasters. However, traditional forest fire detection methods are often hindered by their reliance on manual observation and satellite imagery with low spatial resolution. This paper emphasizes the role of transfer learning in enhancing forest fire detection in India, particularly in overcoming data collection challenges and improving model accuracy across various regions. We compare traditional learning methods with transfer learning, focusing on the unique challenges posed by regional differences in terrain, climate, and vegetation. Transfer learning can be categorized into several types based on the similarity between the source and target tasks, as well as the type of knowledge transferred. One key method is utilizing pre-trained models for efficient transfer learning, which significantly reduces the need for extensive labeled data. We outline the transfer learning process, demonstrating how researchers can adapt pre-trained models like MobileNetV2 for specific tasks such as forest fire detection. Finally, we present experimental results from training and evaluating a deep learning model using the Uttarakhand forest fire dataset, showcasing the effectiveness of transfer learning in this context.

* 15 pages, 6 figures

Seg-HGNN: Unsupervised and Light-Weight Image Segmentation with Hyperbolic Graph Neural Networks

Sep 10, 2024

Debjyoti Mondal, Rahul Mishra, Chandan Pandey

Abstract: Image analysis in Euclidean space through linear hyperspaces is well studied. However, in the quest for more effective image representations, we turn to hyperbolic manifolds. They provide a compelling alternative to capture complex hierarchical relationships in images with remarkably small dimensionality. To demonstrate hyperbolic embeddings' competence, we introduce a light-weight hyperbolic graph neural network for image segmentation, encompassing patch-level features in a very small embedding size. Our solution, Seg-HGNN, surpasses the current best unsupervised method by 2.5% and 4% on VOC-07 and VOC-12 for localization, and by 0.8% and 1.3% on CUB-200 and ECSSD for segmentation, respectively. With less than 7.5k trainable parameters, Seg-HGNN delivers effective and fast (approximately 2 images/second) results on standard GPUs like the GTX1650. This empirical evaluation presents compelling evidence of the efficacy and potential of hyperbolic representations for vision tasks.

* BMVC 2024

No Size Fits All: The Perils and Pitfalls of Leveraging LLMs Vary with Company Size

Jul 21, 2024

Ashok Urlana, Charaka Vinayak Kumar, Bala Mallikarjunarao Garlapati, Ajeet Kumar Singh, Rahul Mishra

Abstract: Large language models (LLMs) are playing a pivotal role in deploying strategic use cases across a range of organizations, from large pan-continental companies to emerging startups. The issues and challenges involved in the successful utilization of LLMs can vary significantly depending on the size of the organization. It is important to study and discuss these pertinent issues of LLM adaptation with a focus on the scale of the industrial concerns and brainstorm possible solutions and prospective directions. Such a study has not been prominently featured in the current research literature. In this study, we adopt a threefold strategy: first, we conduct a case study with industry practitioners to formulate the key research questions; second, we examine existing industrial publications to address these questions; and finally, we provide a practical guide for industries to utilize LLMs more efficiently.

* 17 pages, 3 figures

Exploring News Summarization and Enrichment in a Highly Resource-Scarce Indian Language: A Case Study of Mizo

Apr 25, 2024

Abhinaba Bala, Ashok Urlana, Rahul Mishra, Parameswari Krishnamurthy

Abstract: Obtaining sufficient information in one's mother tongue is crucial for satisfying the information needs of the users. While high-resource languages have abundant online resources, the situation is less than ideal for very low-resource languages. Moreover, the insufficient reporting of vital national and international events continues to be a worry, especially in languages with scarce resources, like Mizo. In this paper, we conduct a study to investigate the effectiveness of a simple methodology designed to generate a holistic summary for Mizo news articles, which leverages English-language news to supplement and enhance the information related to the corresponding news events. Furthermore, we make available 500 Mizo news articles and corresponding enriched holistic summaries. Human evaluation confirms that our approach significantly enhances the information coverage of Mizo news articles. The Mizo dataset and code can be accessed at https://github.com/barvin04/mizo_enrichment

* Accepted at LREC-COLING2024 WILDRE Workshop

LimGen: Probing the LLMs for Generating Suggestive Limitations of Research Papers

Mar 22, 2024

Abdur Rahman Bin Md Faizullah, Ashok Urlana, Rahul Mishra

Abstract: Examining limitations is a crucial step in the scholarly research reviewing process, revealing aspects where a study might lack decisiveness or require enhancement. This aids readers in considering broader implications for further research. In this article, we present a novel and challenging task of Suggestive Limitation Generation (SLG) for research papers. We compile a dataset called LimGen, encompassing 4068 research papers and their associated limitations from the ACL anthology. We investigate several approaches to harness large language models (LLMs) for producing suggestive limitations, by thoroughly examining the related challenges, practical insights, and potential opportunities. Our LimGen dataset and code can be accessed at https://github.com/armbf/LimGen.

* 16 pages, 3 figures
