Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Isar Nejadgholi

Gender-Neutral Machine Translation Strategies in Practice

Jun 18, 2025

Hillary Dawkins, Isar Nejadgholi, Chi-kiu Lo

Abstract:Gender-inclusive machine translation (MT) should preserve gender ambiguity in the source to avoid misgendering and representational harms. While gender ambiguity often occurs naturally in notional gender languages such as English, maintaining that gender neutrality in grammatical gender languages is a challenge. Here we assess the sensitivity of 21 MT systems to the need for gender neutrality in response to gender ambiguity in three translation directions of varying difficulty. The specific gender-neutral strategies that are observed in practice are categorized and discussed. Additionally, we examine the effect of binary gender stereotypes on the use of gender-neutral translation. In general, we report a disappointing absence of gender-neutral translations in response to gender ambiguity. However, we observe a small handful of MT systems that switch to gender neutral translation using specific strategies, depending on the target language.

* to appear at GITT 2025

Via

Access Paper or Ask Questions

WildFireCan-MMD: A Multimodal dataset for Classification of User-generated Content During Wildfires in Canada

Apr 17, 2025

Braeden Sherritt, Isar Nejadgholi, Marzieh Amini

Figure 1 for WildFireCan-MMD: A Multimodal dataset for Classification of User-generated Content During Wildfires in Canada

Figure 2 for WildFireCan-MMD: A Multimodal dataset for Classification of User-generated Content During Wildfires in Canada

Figure 3 for WildFireCan-MMD: A Multimodal dataset for Classification of User-generated Content During Wildfires in Canada

Figure 4 for WildFireCan-MMD: A Multimodal dataset for Classification of User-generated Content During Wildfires in Canada

Abstract:Rapid information access is vital during wildfires, yet traditional data sources are slow and costly. Social media offers real-time updates, but extracting relevant insights remains a challenge. We present WildFireCan-MMD, a new multimodal dataset of X posts from recent Canadian wildfires, annotated across 13 key themes. Evaluating both Vision Language Models and custom-trained classifiers, we show that while zero-shot prompting offers quick deployment, even simple trained models outperform them when labelled data is available, by up to 23%. Our findings highlight the enduring importance of tailored datasets and task-specific training. Importantly, such datasets should be localized, as disaster response requirements vary across regions and contexts.

Via

Access Paper or Ask Questions

Tackling Social Bias against the Poor: A Dataset and Taxonomy on Aporophobia

Apr 17, 2025

Georgina Curto, Svetlana Kiritchenko, Muhammad Hammad Fahim Siddiqui, Isar Nejadgholi, Kathleen C. Fraser

Abstract:Eradicating poverty is the first goal in the United Nations Sustainable Development Goals. However, aporophobia -- the societal bias against people living in poverty -- constitutes a major obstacle to designing, approving and implementing poverty-mitigation policies. This work presents an initial step towards operationalizing the concept of aporophobia to identify and track harmful beliefs and discriminative actions against poor people on social media. In close collaboration with non-profits and governmental organizations, we conduct data collection and exploration. Then we manually annotate a corpus of English tweets from five world regions for the presence of (1) direct expressions of aporophobia, and (2) statements referring to or criticizing aporophobic views or actions of others, to comprehensively characterize the social media discourse related to bias and discrimination against the poor. Based on the annotated data, we devise a taxonomy of categories of aporophobic attitudes and actions expressed through speech on social media. Finally, we train several classifiers and identify the main challenges for automatic detection of aporophobia in social networks. This work paves the way towards identifying, tracking, and mitigating aporophobic views on social media at scale.

* In Findings of the Association for Computational Linguistics: NAACL 2025

Via

Access Paper or Ask Questions

WMT24 Test Suite: Gender Resolution in Speaker-Listener Dialogue Roles

Nov 09, 2024

Hillary Dawkins, Isar Nejadgholi, Chi-kiu Lo

Abstract:We assess the difficulty of gender resolution in literary-style dialogue settings and the influence of gender stereotypes. Instances of the test suite contain spoken dialogue interleaved with external meta-context about the characters and the manner of speaking. We find that character and manner stereotypes outside of the dialogue significantly impact the gender agreement of referents within the dialogue.

Via

Access Paper or Ask Questions

Challenging Negative Gender Stereotypes: A Study on the Effectiveness of Automated Counter-Stereotypes

Apr 18, 2024

Isar Nejadgholi, Kathleen C. Fraser, Anna Kerkhof, Svetlana Kiritchenko

Figure 1 for Challenging Negative Gender Stereotypes: A Study on the Effectiveness of Automated Counter-Stereotypes

Figure 2 for Challenging Negative Gender Stereotypes: A Study on the Effectiveness of Automated Counter-Stereotypes

Figure 3 for Challenging Negative Gender Stereotypes: A Study on the Effectiveness of Automated Counter-Stereotypes

Figure 4 for Challenging Negative Gender Stereotypes: A Study on the Effectiveness of Automated Counter-Stereotypes

Abstract:Gender stereotypes are pervasive beliefs about individuals based on their gender that play a significant role in shaping societal attitudes, behaviours, and even opportunities. Recognizing the negative implications of gender stereotypes, particularly in online communications, this study investigates eleven strategies to automatically counter-act and challenge these views. We present AI-generated gender-based counter-stereotypes to (self-identified) male and female study participants and ask them to assess their offensiveness, plausibility, and potential effectiveness. The strategies of counter-facts and broadening universals (i.e., stating that anyone can have a trait regardless of group membership) emerged as the most robust approaches, while humour, perspective-taking, counter-examples, and empathy for the speaker were perceived as less effective. Also, the differences in ratings were more pronounced for stereotypes about the different targets than between the genders of the raters. Alarmingly, many AI-generated counter-stereotypes were perceived as offensive and/or implausible. Our analysis and the collected dataset offer foundational insight into counter-stereotype generation, guiding future efforts to develop strategies that effectively challenge gender stereotypes in online interactions.

* LREC-COLING2024

Via

Access Paper or Ask Questions

Projective Methods for Mitigating Gender Bias in Pre-trained Language Models

Mar 27, 2024

Hillary Dawkins, Isar Nejadgholi, Daniel Gillis, Judi McCuaig

Abstract:Mitigation of gender bias in NLP has a long history tied to debiasing static word embeddings. More recently, attention has shifted to debiasing pre-trained language models. We study to what extent the simplest projective debiasing methods, developed for word embeddings, can help when applied to BERT's internal representations. Projective methods are fast to implement, use a small number of saved parameters, and make no updates to the existing model parameters. We evaluate the efficacy of the methods in reducing both intrinsic bias, as measured by BERT's next sentence prediction task, and in mitigating observed bias in a downstream setting when fine-tuned. To this end, we also provide a critical analysis of a popular gender-bias assessment test for quantifying intrinsic bias, resulting in an enhanced test set and new bias measures. We find that projective methods can be effective at both intrinsic bias and downstream bias mitigation, but that the two outcomes are not necessarily correlated. This finding serves as a warning that intrinsic bias test sets, based either on language modeling tasks or next sentence prediction, should not be the only benchmark in developing a debiased language model.

Via

Access Paper or Ask Questions

Socially Aware Synthetic Data Generation for Suicidal Ideation Detection Using Large Language Models

Jan 25, 2024

Hamideh Ghanadian, Isar Nejadgholi, Hussein Al Osman

Abstract:Suicidal ideation detection is a vital research area that holds great potential for improving mental health support systems. However, the sensitivity surrounding suicide-related data poses challenges in accessing large-scale, annotated datasets necessary for training effective machine learning models. To address this limitation, we introduce an innovative strategy that leverages the capabilities of generative AI models, such as ChatGPT, Flan-T5, and Llama, to create synthetic data for suicidal ideation detection. Our data generation approach is grounded in social factors extracted from psychology literature and aims to ensure coverage of essential information related to suicidal ideation. In our study, we benchmarked against state-of-the-art NLP classification models, specifically, those centered around the BERT family structures. When trained on the real-world dataset, UMD, these conventional models tend to yield F1-scores ranging from 0.75 to 0.87. Our synthetic data-driven method, informed by social factors, offers consistent F1-scores of 0.82 for both models, suggesting that the richness of topics in synthetic data can bridge the performance gap across different model complexities. Most impressively, when we combined a mere 30% of the UMD dataset with our synthetic data, we witnessed a substantial increase in performance, achieving an F1-score of 0.88 on the UMD test set. Such results underscore the cost-effectiveness and potential of our approach in confronting major challenges in the field, such as data scarcity and the quest for diversity in data representation.

* IEEE Access

Via

Access Paper or Ask Questions

Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers

Jul 04, 2023

Isar Nejadgholi, Svetlana Kiritchenko, Kathleen C. Fraser, Esma Balkır

Figure 1 for Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers

Figure 2 for Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers

Figure 3 for Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers

Figure 4 for Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers

Abstract:Classifiers tend to learn a false causal relationship between an over-represented concept and a label, which can result in over-reliance on the concept and compromised classification accuracy. It is imperative to have methods in place that can compare different models and identify over-reliances on specific concepts. We consider three well-known abusive language classifiers trained on large English datasets and focus on the concept of negative emotions, which is an important signal but should not be learned as a sufficient feature for the label of abuse. Motivated by the definition of global sufficiency, we first examine the unwanted dependencies learned by the classifiers by assessing their accuracy on a challenge set across all decision thresholds. Further, recognizing that a challenge set might not always be available, we introduce concept-based explanation metrics to assess the influence of the concept on the labels. These explanations allow us to compare classifiers regarding the degree of false global sufficiency they have learned between a concept and a label.

* Published at WOAH2023 co-located with ACL2023

Via

Access Paper or Ask Questions

ChatGPT for Suicide Risk Assessment on Social Media: Quantitative Evaluation of Model Performance, Potentials and Limitations

Jun 15, 2023

Hamideh Ghanadian, Isar Nejadgholi, Hussein Al Osman

Figure 1 for ChatGPT for Suicide Risk Assessment on Social Media: Quantitative Evaluation of Model Performance, Potentials and Limitations

Figure 2 for ChatGPT for Suicide Risk Assessment on Social Media: Quantitative Evaluation of Model Performance, Potentials and Limitations

Figure 3 for ChatGPT for Suicide Risk Assessment on Social Media: Quantitative Evaluation of Model Performance, Potentials and Limitations

Figure 4 for ChatGPT for Suicide Risk Assessment on Social Media: Quantitative Evaluation of Model Performance, Potentials and Limitations

Abstract:This paper presents a novel framework for quantitatively evaluating the interactive ChatGPT model in the context of suicidality assessment from social media posts, utilizing the University of Maryland Reddit suicidality dataset. We conduct a technical evaluation of ChatGPT's performance on this task using Zero-Shot and Few-Shot experiments and compare its results with those of two fine-tuned transformer-based models. Additionally, we investigate the impact of different temperature parameters on ChatGPT's response generation and discuss the optimal temperature based on the inconclusiveness rate of ChatGPT. Our results indicate that while ChatGPT attains considerable accuracy in this task, transformer-based models fine-tuned on human-annotated datasets exhibit superior performance. Moreover, our analysis sheds light on how adjusting the ChatGPT's hyperparameters can improve its ability to assist mental health professionals in this critical task.

Via

Access Paper or Ask Questions

The crime of being poor

Mar 24, 2023

Georgina Curto, Svetlana Kiritchenko, Isar Nejadgholi, Kathleen C. Fraser

Abstract:The criminalization of poverty has been widely denounced as a collective bias against the most vulnerable. NGOs and international organizations claim that the poor are blamed for their situation, are more often associated with criminal offenses than the wealthy strata of society and even incur criminal offenses simply as a result of being poor. While no evidence has been found in the literature that correlates poverty and overall criminality rates, this paper offers evidence of a collective belief that associates both concepts. This brief report measures the societal bias that correlates criminality with the poor, as compared to the rich, by using Natural Language Processing (NLP) techniques in Twitter. The paper quantifies the level of crime-poverty bias in a panel of eight different English-speaking countries. The regional differences in the association between crime and poverty cannot be justified based on different levels of inequality or unemployment, which the literature correlates to property crimes. The variation in the observed rates of crime-poverty bias for different geographic locations could be influenced by cultural factors and the tendency to overestimate the equality of opportunities and social mobility in specific countries. These results have consequences for policy-making and open a new path of research for poverty mitigation with the focus not only on the poor but on society as a whole. Acting on the collective bias against the poor would facilitate the approval of poverty reduction policies, as well as the restoration of the dignity of the persons affected.

Via

Access Paper or Ask Questions