Kevin Esterling

Do Words Reflect Beliefs? Evaluating Belief Depth in Large Language Models

Apr 23, 2025

Shariar Kabir, Kevin Esterling, Yue Dong

Abstract: Large Language Models (LLMs) are increasingly shaping political discourse, yet their responses often display inconsistency when subjected to scrutiny. While prior research has primarily categorized LLM outputs as left- or right-leaning to assess their political stances, a critical question remains: Do these responses reflect genuine internal beliefs or merely surface-level alignment with training data? To address this, we propose a novel framework for evaluating belief depth by analyzing (1) argumentative consistency and (2) uncertainty quantification. We evaluate 12 LLMs on 19 economic policies from the Political Compass Test, challenging their belief stability with both supportive and opposing arguments. Our analysis reveals that LLMs exhibit topic-specific belief stability rather than a uniform ideological stance. Notably, up to 95% of left-leaning models' responses and 89% of right-leaning models' responses remain consistent under the challenge, enabling semantic entropy to achieve high accuracy (AUROC=0.78), effectively distinguishing surface-level alignment from genuine belief. These findings call into question the assumption that LLMs maintain stable, human-like political ideologies, emphasizing the importance of conducting topic-specific reliability assessments for real-world applications.

* 20 pages, 9 figures
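
The abstract above names semantic entropy as its uncertainty measure. As a rough illustration only, the sketch below computes a discrete entropy over meaning clusters of sampled responses. It assumes the responses have already been grouped into clusters by some external step; the function name `semantic_entropy` and the example cluster labels are hypothetical, and this is not taken from the paper's implementation.

```python
import math
from collections import Counter


def semantic_entropy(cluster_ids):
    """Entropy in nats over meaning clusters of sampled responses.

    cluster_ids holds one label per sampled response; responses judged
    to mean the same thing share a label. How that grouping is done is
    outside this sketch and is an assumption here.
    """
    if not cluster_ids:
        raise ValueError("need at least one sampled response")
    total = len(cluster_ids)
    counts = Counter(cluster_ids)
    # Sum of p * log(1 / p) over the clusters, where p is the cluster's share.
    return sum((n / total) * math.log(total / n) for n in counts.values())


# Hypothetical labels: five samples that all agree, and five that split.
stable = semantic_entropy(["agree"] * 5)
mixed = semantic_entropy(["agree", "disagree", "agree", "disagree", "neutral"])
```

Under this sketch, a model whose sampled answers all fall into one cluster scores zero, while answers that scatter across clusters score higher, which is the intuition behind using the measure to flag unstable stances.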

WIBA: What Is Being Argued? A Comprehensive Approach to Argument Mining

May 01, 2024

Arman Irani, Ju Yeon Park, Kevin Esterling, Michalis Faloutsos

Abstract: We propose WIBA, a novel framework and suite of methods that enable the comprehensive understanding of "What Is Being Argued" across contexts. Our approach develops a comprehensive framework that detects: (a) the existence, (b) the topic, and (c) the stance of an argument, correctly accounting for the logical dependence among the three tasks. Our algorithm leverages fine-tuning and prompt engineering of Large Language Models. We evaluate our approach and show that it performs well in all three capabilities. First, we develop and release an Argument Detection model that can classify a piece of text as an argument with an F1 score between 79% and 86% on three different benchmark datasets. Second, we release a language model that can identify the topic being argued in a sentence, be it implicit or explicit, with an average similarity score of 71%, outperforming current naive methods by nearly 40%. Finally, we develop a method for Argument Stance Classification, and evaluate the capability of our approach, showing it achieves a classification F1 score between 71% and 78% across three diverse benchmark datasets. Our evaluation demonstrates that WIBA allows the comprehensive understanding of What Is Being Argued in large corpora across diverse contexts, which is of core interest to many applications in linguistics, communication, and social and computer science. To facilitate accessibility to the advancements outlined in this work, we release WIBA as a free open access platform (wiba.dev).

* 8 pages, 2 figures, submitted to The 16th International Conference on Advances in Social Networks Analysis and Mining (ASONAM) '24
