Animesh Nighojkar

Giving AI Personalities Leads to More Human-Like Reasoning

Feb 21, 2025

Animesh Nighojkar, Bekhzodbek Moydinboyev, My Duong, John Licato

Abstract: In computational cognitive modeling, capturing the full spectrum of human judgment and decision-making processes, beyond just optimal behaviors, is a significant challenge. This study explores whether Large Language Models (LLMs) can emulate the breadth of human reasoning by predicting both intuitive, fast System 1 and deliberate, slow System 2 processes. We investigate the potential of AI to mimic diverse reasoning behaviors across a human population, addressing what we call the "full reasoning spectrum problem". We designed reasoning tasks using a novel generalization of the Natural Language Inference (NLI) format to evaluate LLMs' ability to replicate human reasoning. The questions were crafted to elicit both System 1 and System 2 responses. Human responses were collected through crowd-sourcing and the entire distribution was modeled, rather than just the majority of the answers. We used personality-based prompting inspired by the Big Five personality model to elicit AI responses reflecting specific personality traits, capturing the diversity of human reasoning, and exploring how personality traits influence LLM outputs. Combined with genetic algorithms to optimize the weighting of these prompts, this method was tested alongside traditional machine learning models. The results show that LLMs can mimic human response distributions, with open-source models like Llama and Mistral outperforming proprietary GPT models. Personality-based prompting, especially when optimized with genetic algorithms, significantly enhanced LLMs' ability to predict human response distributions, suggesting that capturing suboptimal, naturalistic reasoning may require modeling techniques incorporating diverse reasoning styles and psychological profiles. The study concludes that personality-based prompting combined with genetic algorithms is promising for enhancing AI's 'human-ness' in reasoning.
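
The idea of weighting personality prompts with a genetic algorithm can be illustrated with a minimal sketch. It is not the authors' implementation: the three prompt distributions, the human distribution, the L1 fitness function, and all function names below are hypothetical and chosen only to show the shape of the technique.

```python
import random

def mix(weights, prompt_dists):
    """Weighted average of per-prompt answer distributions."""
    total = sum(weights)
    n_options = len(prompt_dists[0])
    return [
        sum(w * d[k] for w, d in zip(weights, prompt_dists)) / total
        for k in range(n_options)
    ]

def l1_distance(p, q):
    """Sum of absolute differences between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))

def evolve_weights(prompt_dists, human_dist, pop_size=30, generations=60, seed=0):
    """Toy genetic algorithm: search for prompt weights whose mixture
    is close (in L1 distance) to the human response distribution."""
    rng = random.Random(seed)
    n = len(prompt_dists)

    def fitness(w):
        return -l1_distance(mix(w, prompt_dists), human_dist)

    population = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(pop_size)]
    population[0] = [1.0] * n  # include the uniform weighting as a baseline
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # elitist selection: keep the top half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(n)
            child[i] = max(1e-9, child[i] + rng.gauss(0, 0.1))  # mutate one gene
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    total = sum(best)
    return [w / total for w in best]  # normalized so the weights sum to 1

# Hypothetical answer distributions over three response options,
# one row per personality prompt (made-up numbers).
prompt_dists = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
]
human_dist = [0.4, 0.4, 0.2]  # hypothetical crowd-sourced distribution

weights = evolve_weights(prompt_dists, human_dist)
print("weights:", weights)
print("mixture:", mix(weights, prompt_dists))
```

Because the top half of each generation is carried over unchanged and the uniform weighting is seeded into the initial population, the returned weights fit the target at least as well as equal weighting does in this toy setup.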

No Strong Feelings One Way or Another: Re-operationalizing Neutrality in Natural Language Inference

Jun 16, 2023

Animesh Nighojkar, Antonio Laverghetta Jr., John Licato

Abstract: Natural Language Inference (NLI) has been a cornerstone task in evaluating language models' inferential reasoning capabilities. However, the standard three-way classification scheme used in NLI has well-known shortcomings in evaluating models' ability to capture the nuances of natural human reasoning. In this paper, we argue that the operationalization of the neutral label in current NLI datasets has low validity, is interpreted inconsistently, and that at least one important sense of neutrality is often ignored. We uncover the detrimental impact of these shortcomings, which in some cases leads to annotation datasets that actually decrease performance on downstream tasks. We compare approaches of handling annotator disagreement and identify flaws in a recent NLI dataset that designs an annotator study based on a problematic operationalization. Our findings highlight the need for a more refined evaluation framework for NLI, and we hope to spark further discussion and action in the NLP community.

* Appearing at the 17th Linguistic Annotation Workshop at ACL 2023

Cognitive Modeling of Semantic Fluency Using Transformers

Aug 20, 2022

Animesh Nighojkar, Anna Khlyzova, John Licato

Abstract: Can deep language models be explanatory models of human cognition? If so, what are their limits? To explore this question, we propose an approach called hyperparameter hypothesization that uses predictive hyperparameter tuning to find individuating descriptors of cognitive-behavioral profiles. We take the first step in this approach by predicting human performance in the semantic fluency task (SFT), a well-studied task in cognitive science that has never before been modeled using transformer-based language models (TLMs). In our task setup, we compare several approaches to predicting which word an individual performing SFT will utter next. We report preliminary evidence suggesting that, despite obvious implementational differences in how people and TLMs learn and use language, TLMs can be used to identify individual differences in human fluency task behaviors better than existing computational models, and may offer insights into human memory retrieval strategies, cognitive processes not typically considered to be the kinds of things TLMs can model. Finally, we discuss the implications of this work for cognitive modeling of knowledge representations.

* Cognitive Aspects of Knowledge Representation workshop at IJCAI-ECAI 2022

Predicting Human Psychometric Properties Using Computational Language Models

May 12, 2022

Antonio Laverghetta Jr., Animesh Nighojkar, Jamshidbek Mirzakhalov, John Licato

Abstract: Transformer-based language models (LMs) continue to achieve state-of-the-art performance on natural language processing (NLP) benchmarks, including tasks designed to mimic human-inspired "commonsense" competencies. To better understand the degree to which LMs can be said to have certain linguistic reasoning skills, researchers are beginning to adapt the tools and concepts from psychometrics. But to what extent can benefits flow in the other direction? In other words, can LMs be of use in predicting the psychometric properties of test items, when those items are given to human participants? If so, the benefit for psychometric practitioners is enormous, as it can reduce the need for multiple rounds of empirical testing. We gather responses from numerous human participants and LMs (transformer- and non-transformer-based) on a broad diagnostic test of linguistic competencies. We then calculate standard psychometric properties of the items in the diagnostic test, using the human responses and the LM responses separately. We then determine how well these two sets of predictions correlate. We find that transformer-based LMs predict the human psychometric data consistently well across most categories, suggesting that they can be used to gather human-like psychometric data without the need for extensive human trials.
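
The comparison described here (compute an item-level property from human and LM responses separately, then correlate the two) can be sketched as follows. This is an illustration under assumptions, not the paper's pipeline: it uses classical item difficulty (proportion correct) as the example property, Pearson correlation as the agreement measure, and small made-up response matrices.

```python
from math import sqrt

def item_difficulty(responses):
    """Classical item difficulty: the proportion of respondents who
    answered each item correctly. `responses` is a list of rows, one
    per respondent, with 1 for a correct answer and 0 otherwise."""
    n_respondents = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_respondents for i in range(n_items)]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical graded responses on a four-item test (made-up data).
human_responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
lm_responses = [  # each row stands for one language model
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]

human_difficulty = item_difficulty(human_responses)
lm_difficulty = item_difficulty(lm_responses)
agreement = pearson(human_difficulty, lm_difficulty)

print("human item difficulty:", human_difficulty)
print("LM item difficulty:   ", lm_difficulty)
print("correlation:", agreement)
```

A high correlation in such a setup would mean that items humans find hard are also the ones the LM population tends to miss, which is the kind of agreement the abstract reports.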

* To appear in Quantitative Psychology, The 86th Annual Meeting of the Psychometric Society, Virtual. arXiv admin note: substantial text overlap with arXiv:2106.06849

Improving Paraphrase Detection with the Adversarial Paraphrasing Task

Jun 14, 2021

Animesh Nighojkar, John Licato

Abstract: If two sentences have the same meaning, it should follow that they are equivalent in their inferential properties, i.e., each sentence should textually entail the other. However, many paraphrase datasets currently in widespread use rely on a sense of paraphrase based on word overlap and syntax. Can we instead teach models to identify paraphrases in a way that draws on the inferential properties of the sentences, and is not over-reliant on lexical and syntactic similarities of a sentence pair? We apply the adversarial paradigm to this question, and introduce a new adversarial method of dataset creation for paraphrase identification: the Adversarial Paraphrasing Task (APT), which asks participants to generate semantically equivalent (in the sense of mutually implicative) but lexically and syntactically disparate paraphrases. These sentence pairs can then be used both to test paraphrase identification models (which get barely random accuracy) and to improve their performance. To accelerate dataset generation, we explore automation of APT using T5, and show that the resulting dataset also improves accuracy. We discuss implications for paraphrase detection and release our dataset in the hope of making paraphrase detection models better able to detect sentence-level meaning equivalence.

Can Transformer Language Models Predict Psychometric Properties?

Jun 12, 2021

Antonio Laverghetta Jr., Animesh Nighojkar, Jamshidbek Mirzakhalov, John Licato

Abstract: Transformer-based language models (LMs) continue to advance state-of-the-art performance on NLP benchmark tasks, including tasks designed to mimic human-inspired "commonsense" competencies. To better understand the degree to which LMs can be said to have certain linguistic reasoning skills, researchers are beginning to adapt the tools and concepts of the field of psychometrics. But to what extent can the benefits flow in the other direction? That is, can LMs be of use in predicting what the psychometric properties of test items will be when those items are given to human participants? We gather responses from numerous human participants and LMs (transformer- and non-transformer-based) on a broad diagnostic test of linguistic competencies. We then use the responses to calculate standard psychometric properties of the items in the diagnostic test, using the human responses and the LM responses separately. We then determine how well these two sets of predictions match. We find cases in which transformer-based LMs predict psychometric properties consistently well in certain categories but consistently poorly in others, thus providing new insights into fundamental similarities and differences between human and LM reasoning.

* Proceedings of the 10th Joint Conference on Lexical and Computational Semantics (*SEM 2021)

Probing the Natural Language Inference Task with Automated Reasoning Tools

May 06, 2020

Zaid Marji, Animesh Nighojkar, John Licato

Abstract: The Natural Language Inference (NLI) task is an important task in modern NLP, as it asks a broad question to which many other tasks may be reducible: Given a pair of sentences, does the first entail the second? Although the state of the art on current benchmark datasets for NLI is deep learning-based, it is worthwhile to use other techniques to examine the logical structure of the NLI task. We do so by testing how well a machine-oriented controlled natural language (Attempto Controlled English) can be used to parse NLI sentences, and how well automated theorem provers can reason over the resulting formulae. To improve performance, we develop a set of syntactic and semantic transformation rules. We report their performance, and discuss implications for NLI and logic-based NLP.

* Accepted to Proceedings of The 33rd International Florida Artificial Intelligence Research Society Conference (FLAIRS-33, 2020)
