Abstract:Prompting serves as the major way humans interact with Large Language Models (LLM). Commercial AI systems commonly define the role of the LLM in system prompts. For example, ChatGPT uses "You are a helpful assistant" as part of the default system prompt. But is "a helpful assistant" the best role for LLMs? In this study, we present a systematic evaluation of how social roles in system prompts affect model performance. We curate a list of 162 roles covering 6 types of interpersonal relationships and 8 types of occupations. Through extensive analysis of 3 popular LLMs and 2457 questions, we show that adding interpersonal roles in prompts consistently improves the models' performance over a range of questions. Moreover, while we find that using gender-neutral roles and specifying the role as the audience leads to better performances, predicting which role leads to the best performance remains a challenging task, and that frequency, similarity, and perplexity do not fully explain the effect of social roles on model performances. Our results can help inform the design of system prompts for AI systems. Code and data are available at https://github.com/Jiaxin-Pei/Prompting-with-Social-Roles.