Alpin Dale

PIPPA: A Partially Synthetic Conversational Dataset

Aug 11, 2023

Tear Gosling, Alpin Dale, Yinhe Zheng

Abstract: With the emergence of increasingly powerful large language models, there is a burgeoning interest in leveraging these models for casual conversation and role-play applications. However, existing conversational and role-playing datasets often fail to capture the diverse and nuanced interactions typically exhibited by real-world role-play participants. To address this limitation and contribute to the rapidly growing field, we introduce a partially-synthetic dataset named PIPPA (Personal Interaction Pairs between People and AI). PIPPA is a result of a community-driven crowdsourcing effort involving a group of role-play enthusiasts. The dataset comprises over 1 million utterances that are distributed across 26,000 conversation sessions and provides a rich resource for researchers and AI developers to explore and refine conversational AI systems in the context of role-play scenarios.

* 13 pages, 5 figures