Jaspreet Singh

Exploring the Effect of Robotic Embodiment and Empathetic Tone of LLMs on Empathy Elicitation

Mar 26, 2025

Liza Darwesh, Jaspreet Singh, Marin Marian, Eduard Alexa, Koen Hindriks, Kim Baraka

Abstract:This study investigates the elicitation of empathy toward a third party through interaction with social agents. Participants engaged with either a physical robot or a voice-enabled chatbot, both driven by a large language model (LLM) programmed to exhibit either an empathetic tone or remain neutral. The interaction is focused on a fictional character, Katie Banks, who is in a challenging situation and in need of financial donations. The willingness to help Katie, measured by the number of hours participants were willing to volunteer, along with their perceptions of the agent, were assessed for 60 participants. Results indicate that neither robotic embodiment nor empathetic tone significantly influenced participants' willingness to volunteer. While the LLM effectively simulated human empathy, fostering genuine empathetic responses in participants proved challenging.

* Proceedings of the International Conference on Social Robotics (ICSR 2024), Springer, 2025, pp. 1-11
* *Liza Darwesh, Jaspreet Singh, Marin Marian, and Eduard Alexa contributed equally to this work.*

Data Augmentation for Sample Efficient and Robust Document Ranking

Nov 26, 2023

Abhijit Anand, Jurek Leonhardt, Jaspreet Singh, Koustav Rudra, Avishek Anand

Abstract:Contextual ranking models have delivered impressive performance improvements over classical models in the document ranking task. However, these highly over-parameterized models tend to be data-hungry and require large amounts of data even for fine-tuning. In this paper, we propose data-augmentation methods for effective and robust ranking performance. One of the key benefits of using data augmentation is in achieving sample efficiency or learning effectively when we have only a small amount of training data. We propose supervised and unsupervised data augmentation schemes by creating training data using parts of the relevant documents in the query-document pairs. We then adapt a family of contrastive losses for the document ranking task that can exploit the augmented data to learn an effective ranking model. Our extensive experiments on subsets of the MS MARCO and TREC-DL test sets show that data augmentation, along with the ranking-adapted contrastive losses, results in performance improvements under most dataset sizes. Apart from sample efficiency, we conclusively show that data augmentation results in robust models when transferred to out-of-domain benchmarks. Our performance improvements in in-domain and more prominently in out-of-domain benchmarks show that augmentation regularizes the ranking model and improves its robustness and generalization capability.

Earthquake Magnitude and b value prediction model using Extreme Learning Machine

Jan 23, 2023

Gunbir Singh Baveja, Jaspreet Singh

Abstract:Earthquake prediction has been a challenging research area for many decades, where the future occurrence of this highly uncertain calamity is predicted. In this paper, several parametric and non-parametric features were calculated, where the non-parametric features were calculated using the parametric features. Eight seismic features were calculated using the Gutenberg-Richter law, the total recurrence, and the seismic energy release. Additionally, criteria such as Maximum Relevance and Maximum Redundancy were applied to choose the pertinent features. These features along with others were used as input for an Extreme Learning Machine (ELM) Regression Model. Magnitude and time data of five decades from the Assam-Guwahati region were used to create this model for magnitude prediction. The Testing Accuracy and Testing Speed were computed taking the Root Mean Squared Error (RMSE) as the parameter for evaluating the model. As confirmed by the results, ELM shows better scalability with much faster training and testing speed (up to a thousand times faster) than traditional Support Vector Machines. The testing RMSE came out to be around 0.097. To further test the model's robustness, magnitude-time data from California were used to calculate the seismic indicators, which were then fed into an ELM and tested on the Assam-Guwahati region. The model proves to be robust and can be implemented in early warning systems, which continue to be a major part of disaster response and management.

* 11 pages, 13 figures, 2 tables
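
The abstract above derives seismic features from the Gutenberg-Richter law, whose slope is the b value named in the title. As a rough illustration only, the sketch below estimates b with Aki's maximum-likelihood formula, b = log10(e) / (mean(M) - Mc), for continuous magnitudes. The paper does not state which estimator it uses, so the choice of estimator, the function names, the completeness magnitude `Mc`, and the synthetic catalog are all assumptions made for this example.

```python
import math
import random


def gutenberg_richter_b_value(magnitudes, completeness_magnitude):
    """Aki maximum-likelihood estimate of the Gutenberg-Richter b value.

    Hypothetical helper: keeps only events at or above the completeness
    magnitude and assumes continuous (unbinned) magnitudes.
    """
    usable = [m for m in magnitudes if m >= completeness_magnitude]
    if not usable:
        raise ValueError("no events at or above the completeness magnitude")
    mean_excess = sum(usable) / len(usable) - completeness_magnitude
    if mean_excess <= 0:
        raise ValueError("mean magnitude must exceed the completeness magnitude")
    return math.log10(math.e) / mean_excess


def synthetic_catalog(b_value, completeness_magnitude, size, seed=0):
    """Draw illustrative magnitudes (not real data).

    Under the Gutenberg-Richter law, the excess over the threshold is
    exponentially distributed with rate b * ln(10).
    """
    rng = random.Random(seed)
    rate = b_value * math.log(10)
    return [completeness_magnitude + rng.expovariate(rate) for _ in range(size)]


# Synthetic check: the estimate should land close to the b value used to sample.
catalog = synthetic_catalog(b_value=1.0, completeness_magnitude=4.0, size=20000)
estimate = gutenberg_richter_b_value(catalog, completeness_magnitude=4.0)
print(round(estimate, 2))
```

For catalogs with binned magnitudes, a half-bin correction to the threshold is commonly applied; it is omitted here to keep the sketch short.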

Learning Invariant Representations for Equivariant Neural Networks Using Orthogonal Moments

Sep 22, 2022

Jaspreet Singh, Chandan Singh

Abstract:The convolutional layers of standard convolutional neural networks (CNNs) are equivariant to translation. However, the convolution and fully-connected layers are not equivariant or invariant to other affine geometric transformations. Recently, a new class of CNNs has been proposed in which the conventional layers of CNNs are replaced with equivariant convolution, pooling, and batch-normalization layers. The final classification layer in equivariant neural networks is invariant to different affine geometric transformations such as rotation, reflection and translation, and the scalar value is obtained either by eliminating the spatial dimensions of filter responses using convolution and down-sampling throughout the network or by averaging over the filter responses. In this work, we propose to integrate the orthogonal moments, which give the high-order statistics of the function, as an effective means for encoding global invariance with respect to rotation, reflection and translation in fully-connected layers. As a result, the intermediate layers of the network become equivariant while the classification layer becomes invariant. The most widely used Zernike, pseudo-Zernike and orthogonal Fourier-Mellin moments are considered for this purpose. The effectiveness of the proposed work is evaluated by integrating the invariant transition and fully-connected layer in the architecture of group-equivariant CNNs (G-CNNs) on rotated MNIST and CIFAR10 datasets.

* International Joint Conference on Neural Networks (IJCNN), 2022

Towards Axiomatic Explanations for Neural Ranking Models

Jul 11, 2021

Michael Völske, Alexander Bondarenko, Maik Fröbe, Matthias Hagen, Benno Stein, Jaspreet Singh, Avishek Anand

Abstract:Recently, neural networks have been successfully employed to improve upon state-of-the-art performance in ad-hoc retrieval tasks via machine-learned ranking functions. While neural retrieval models grow in complexity and impact, little is understood about their correspondence with well-studied IR principles. Recent work on interpretability in machine learning has provided tools and techniques to understand neural models in general, yet there has been little progress towards explaining ranking models. We investigate whether one can explain the behavior of neural ranking models in terms of their congruence with well understood principles of document ranking by using established theories from axiomatic IR. Axiomatic analysis of information retrieval models has formalized a set of constraints on ranking decisions that reasonable retrieval models should fulfill. We operationalize this axiomatic thinking to reproduce rankings based on combinations of elementary constraints. This allows us to investigate to what extent the ranking decisions of neural rankers can be explained in terms of retrieval axioms, and which axioms apply in which situations. Our experimental study considers a comprehensive set of axioms over several representative neural rankers. While the existing axioms can already explain the particularly confident ranking decisions rather well, future work should extend the axiom set to also cover the other still "unexplainable" neural IR rank decisions.

* 10 pages, 2 figures. Published in the proceedings of ICTIR 2021

BERTnesia: Investigating the capture and forgetting of knowledge in BERT

Jun 05, 2021

Jonas Wallat, Jaspreet Singh, Avishek Anand

Abstract:Probing complex language models has recently revealed several insights into linguistic and semantic patterns found in the learned representations. In this article, we probe BERT specifically to understand and measure the relational knowledge it captures in its parametric memory. While probing for linguistic understanding is commonly applied to all layers of BERT as well as fine-tuned models, this has not been done for factual knowledge. We utilize existing knowledge base completion tasks (LAMA) to probe every layer of pre-trained as well as fine-tuned BERT models (ranking, question answering, NER). Our findings show that knowledge is not just contained in BERT's final layers. Intermediate layers contribute a significant amount (17-60%) to the total knowledge found. Probing intermediate layers also reveals how different types of knowledge emerge at varying rates. When BERT is fine-tuned, relational knowledge is forgotten. The extent of forgetting is impacted by the fine-tuning objective and the training data. We found that ranking models forget the least and retain more knowledge in their final layer compared to masked language modeling and question-answering. However, masked language modeling performed the best at acquiring new knowledge from the training data. When it comes to learning facts, we found that capacity and fact density are key factors. We hope this initial work will spur further research into understanding the parametric memory of language models and the effect of training objectives on factual knowledge. The code to repeat the experiments is publicly available on GitHub.

* arXiv admin note: substantial text overlap with arXiv:2010.09313

Dissonance Between Human and Machine Understanding

Jan 18, 2021

Zijian Zhang, Jaspreet Singh, Ujwal Gadiraju, Avishek Anand

Abstract:Complex machine learning models are deployed in several critical domains including healthcare and autonomous vehicles nowadays, albeit as functional black boxes. Consequently, there has been a recent surge in interpreting decisions of such complex models in order to explain their actions to humans. Models that correspond to human interpretation of a task are more desirable in certain contexts and can help attribute liability, build trust, expose biases and in turn build better models. It is, therefore, crucial to understand how and which models conform to human understanding of tasks. In this paper, we present a large-scale crowdsourcing study that reveals and quantifies the dissonance between human and machine understanding, through the lens of an image classification task. In particular, we seek to answer the following questions: Which (well-performing) complex ML models are closer to humans in their use of features to make accurate predictions? How does task difficulty affect the feature selection capability of machines in comparison to humans? Are humans consistently better at selecting features that make image recognition more accurate? Our findings have important implications on human-machine collaboration, considering that a long term goal in the field of artificial intelligence is to make machines capable of learning and reasoning like humans.

* Proceedings of the ACM on Human-Computer Interaction, 2019, 3(CSCW): 1-23
* 23 pages, 5 figures

Valid Explanations for Learning to Rank Models

May 17, 2020

Jaspreet Singh, Zhenye Wang, Megha Khosla, Avishek Anand

Abstract:Learning-to-rank (LTR) is a class of supervised learning techniques that apply to ranking problems dealing with a large number of features. The popularity and widespread application of LTR models in prioritizing information in a variety of domains makes their scrutability vital in today's landscape of fair and transparent learning systems. However, limited work exists that deals with interpreting the decisions of learning systems that output rankings. In this paper we propose a model agnostic local explanation method that seeks to identify a small subset of input features as explanation to a ranking decision. We introduce new notions of validity and completeness of explanations specifically for rankings, based on the presence or absence of selected features, as a way of measuring goodness. We devise a novel optimization problem to maximize validity directly and propose greedy algorithms as solutions. In extensive quantitative experiments we show that our approach outperforms other model agnostic explanation approaches across pointwise, pairwise and listwise LTR models in validity while not compromising on completeness.

AMUSED: A Multi-Stream Vector Representation Method for Use in Natural Dialogue

Dec 04, 2019

Gaurav Kumar, Rishabh Joshi, Jaspreet Singh, Promod Yenigalla

Abstract:The problem of building a coherent and non-monotonous conversational agent with proper discourse and coverage is still an area of open research. Current architectures only take care of semantic and contextual information for a given query and fail to completely account for syntactic and external knowledge which are crucial for generating responses in a chit-chat system. To overcome this problem, we propose an end-to-end multi-stream deep learning architecture which learns unified embeddings for query-response pairs by leveraging contextual information from memory networks and syntactic information by incorporating Graph Convolution Networks (GCN) over their dependency parse. A stream of this network also utilizes transfer learning by pre-training a bidirectional transformer to extract semantic representation for each input sentence and incorporates external knowledge through the neighborhood of the entities from a Knowledge Base (KB). We benchmark these embeddings on the next sentence prediction task and significantly improve upon the existing techniques. Furthermore, we use AMUSED to represent queries and responses along with their context to develop a retrieval-based conversational agent which has been validated by expert linguists to have comprehensive engagement with humans.

Toxicity Prediction by Multimodal Deep Learning

Jul 19, 2019

Abdul Karim, Jaspreet Singh, Avinash Mishra, Abdollah Dehzangi, M. A. Hakim Newton, Abdul Sattar

Abstract:Prediction of toxicity levels of chemical compounds is an important issue in Quantitative Structure-Activity Relationship (QSAR) modeling. Although toxicity prediction has achieved significant progress in recent times through deep learning, prediction accuracy levels obtained by even very recent methods are not yet very high. We propose a multimodal deep learning method using multiple heterogeneous neural network types and data representations. We represent chemical compounds by strings, images, and numerical features. We train fully connected, convolutional, and recurrent neural networks and their ensembles. Each data representation or neural network type has its own strengths and weaknesses. Our motivation is to obtain a collective performance that could go beyond the individual performance of each data representation or each neural network type. On a standard toxicity benchmark, our proposed method obtains significantly better accuracy levels than the state-of-the-art toxicity prediction methods.

* 2019 Pacific Rim Knowledge Acquisition Workshop
* Preprint Version
