Title: SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

Oct 18, 2023

Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig (+1 more)

Abstract: Humans are social beings; we pursue social goals in our daily interactions, which is a crucial aspect of social intelligence. Yet, AI systems' abilities in this realm remain elusive. We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and evaluate their social intelligence. In our environment, agents role-play and interact under a wide variety of scenarios; they coordinate, collaborate, exchange, and compete with each other to achieve complex social goals. We simulate the role-play interaction between LLM-based agents and humans within this task space and evaluate their performance with a holistic evaluation framework called SOTOPIA-Eval. With SOTOPIA, we find significant differences between these models in terms of their social intelligence, and we identify a subset of SOTOPIA scenarios, SOTOPIA-hard, that is generally challenging for all models. We find that on this subset, GPT-4 achieves a significantly lower goal completion rate than humans and struggles to exhibit social commonsense reasoning and strategic communication skills. These findings demonstrate SOTOPIA's promise as a general platform for research on evaluating and improving social intelligence in artificial agents.

* Preprint, 43 pages. The first two authors contribute equally
