Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Assem Zhunis

Persona Setting Pitfall: Persistent Outgroup Biases in Large Language Models Arising from Social Identity Adoption

Sep 05, 2024

Wenchao Dong, Assem Zhunis, Dongyoung Jeong, Hyojin Chin, Jiyoung Han, Meeyoung Cha

Abstract:Drawing parallels between human cognition and artificial intelligence, we explored how large language models (LLMs) internalize identities imposed by targeted prompts. Informed by Social Identity Theory, these identity assignments lead LLMs to distinguish between "we" (the ingroup) and "they" (the outgroup). This self-categorization generates both ingroup favoritism and outgroup bias. Nonetheless, existing literature has predominantly focused on ingroup favoritism, often overlooking outgroup bias, which is a fundamental source of intergroup prejudice and discrimination. Our experiment addresses this gap by demonstrating that outgroup bias manifests as strongly as ingroup favoritism. Furthermore, we successfully mitigated the inherent pro-liberal, anti-conservative bias in LLMs by guiding them to adopt the perspectives of the initially disfavored group. These results were replicated in the context of gender bias. Our findings highlight the potential to develop more equitable and balanced language models.

* 23 pages, 5 figures

Via

Access Paper or Ask Questions

I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models

Feb 16, 2024

Wenchao Dong, Assem Zhunis, Hyojin Chin, Jiyoung Han, Meeyoung Cha

Figure 1 for I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models

Figure 2 for I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models

Figure 3 for I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models

Figure 4 for I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models

Abstract:We explored cultural biases-individualism vs. collectivism-in ChatGPT across three Western languages (i.e., English, German, and French) and three Eastern languages (i.e., Chinese, Japanese, and Korean). When ChatGPT adopted an individualistic persona in Western languages, its collectivism scores (i.e., out-group values) exhibited a more negative trend, surpassing their positive orientation towards individualism (i.e., in-group values). Conversely, when a collectivistic persona was assigned to ChatGPT in Eastern languages, a similar pattern emerged with more negative responses toward individualism (i.e., out-group values) as compared to collectivism (i.e., in-group values). The results indicate that when imbued with a particular social identity, ChatGPT discerns in-group and out-group, embracing in-group values while eschewing out-group values. Notably, the negativity towards the out-group, from which prejudices and discrimination arise, exceeded the positivity towards the in-group. The experiment was replicated in the political domain, and the results remained consistent. Furthermore, this replication unveiled an intrinsic Democratic bias in Large Language Models (LLMs), aligning with earlier findings and providing integral insights into mitigating such bias through prompt engineering. Extensive robustness checks were performed using varying hyperparameter and persona setup methods, with or without social identity labels, across other popular language models.

Via

Access Paper or Ask Questions