Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Title:CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Sep 20, 2024

Zhao Cheng, Diane Wan, Matthew Abueg, Sahra Ghalebikesabi, Ren Yi, Eugene Bagdasarian, Borja Balle, Stefan Mellem, Shawn O'Banion

Figure 1 for CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Figure 2 for CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Figure 3 for CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Figure 4 for CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Share this with someone who'll enjoy it:

Abstract:Advances in generative AI point towards a new era of personalized applications that perform diverse tasks on behalf of users. While general AI assistants have yet to fully emerge, their potential to share personal data raises significant privacy challenges. This paper introduces CI-Bench, a comprehensive synthetic benchmark for evaluating the ability of AI assistants to protect personal information during model inference. Leveraging the Contextual Integrity framework, our benchmark enables systematic assessment of information flow across important context dimensions, including roles, information types, and transmission principles. We present a novel, scalable, multi-step synthetic data pipeline for generating natural communications, including dialogues and emails. Unlike previous work with smaller, narrowly focused evaluations, we present a novel, scalable, multi-step data pipeline that synthetically generates natural communications, including dialogues and emails, which we use to generate 44 thousand test samples across eight domains. Additionally, we formulate and evaluate a naive AI assistant to demonstrate the need for further study and careful training towards personal assistant tasks. We envision CI-Bench as a valuable tool for guiding future language model development, deployment, system design, and dataset construction, ultimately contributing to the development of AI assistants that align with users' privacy expectations.

View paper on

Share this with someone who'll enjoy it:

Title:CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

Paper and Code