Betty Li Hou

How To Think About End-To-End Encryption and AI: Training, Processing, Disclosure, and Consent

Dec 28, 2024

Mallory Knodel, Andrés Fábrega, Daniella Ferrari, Jacob Leiken, Betty Li Hou, Derek Yen, Sam de Alfaro, Kyunghyun Cho, Sunoo Park

Abstract: End-to-end encryption (E2EE) has become the gold standard for securing communications, bringing strong confidentiality and privacy guarantees to billions of users worldwide. However, the current push towards widespread integration of artificial intelligence (AI) models, including in E2EE systems, raises some serious security concerns. This work performs a critical examination of the (in)compatibility of AI models and E2EE applications. We explore this on two fronts: (1) the integration of AI "assistants" within E2EE applications, and (2) the use of E2EE data for training AI models. We analyze the potential security implications of each, and identify conflicts with the security guarantees of E2EE. Then, we analyze legal implications of integrating AI models in E2EE applications, given how AI integration can undermine the confidentiality that E2EE promises. Finally, we offer a list of detailed recommendations based on our technical and legal analyses, including: technical design choices that must be prioritized to uphold E2EE security; how service providers must accurately represent E2EE security; and best practices for the default behavior of AI features and for requesting user consent. We hope this paper catalyzes an informed conversation on the tensions that arise between the brisk deployment of AI and the security offered by E2EE, and guides the responsible development of new AI features.

Large Language Models as Misleading Assistants in Conversation

Jul 16, 2024

Betty Li Hou, Kejian Shi, Jason Phang, James Aung, Steven Adler, Rosie Campbell

Abstract: Large Language Models (LLMs) are able to provide assistance on a wide range of information-seeking tasks. However, model outputs may be misleading, whether unintentionally or in cases of intentional deception. We investigate the ability of LLMs to be deceptive in the context of providing assistance on a reading comprehension task, using LLMs as proxies for human users. We compare outcomes of (1) when the model is prompted to provide truthful assistance, (2) when it is prompted to be subtly misleading, and (3) when it is prompted to argue for an incorrect answer. Our experiments show that GPT-4 can effectively mislead both GPT-3.5-Turbo and GPT-4, with deceptive assistants resulting in up to a 23% drop in accuracy on the task compared to when a truthful assistant is used. We also find that providing the user model with additional context from the passage partially mitigates the influence of the deceptive model. This work highlights the ability of LLMs to produce misleading information and the effects this may have in real-world situations.

* Next Generation of AI Safety Workshop, 41st International Conference on Machine Learning (ICML 2024)

Foundational Moral Values for AI Alignment

Nov 28, 2023

Betty Li Hou, Brian Patrick Green

Abstract: Solving the AI alignment problem requires having clear, defensible values towards which AI systems can align. Currently, targets for alignment remain underspecified and do not seem to be built from a philosophically robust structure. We begin the discussion of this problem by presenting five core, foundational values, drawn from moral philosophy and built on the requisites for human existence: survival, sustainable intergenerational existence, society, education, and truth. We show that these values not only provide a clearer direction for technical alignment work, but also serve as a framework to highlight threats and opportunities from AI systems to both obtain and sustain these values.

* AI meets Moral Philosophy and Moral Psychology Workshop, 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

Nov 20, 2023

David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, Samuel R. Bowman

Abstract: We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. We ensure that the questions are high-quality and extremely difficult: experts who have or are pursuing PhDs in the corresponding domains reach 65% accuracy (74% when discounting clear mistakes the experts identified in retrospect), while highly skilled non-expert validators only reach 34% accuracy, despite spending on average over 30 minutes with unrestricted access to the web (i.e., the questions are "Google-proof"). The questions are also difficult for state-of-the-art AI systems, with our strongest GPT-4 based baseline achieving 39% accuracy. If we are to use future AI systems to help us answer very hard questions, for example, when developing new scientific knowledge, we need to develop scalable oversight methods that enable humans to supervise their outputs, which may be difficult even if the supervisors are themselves skilled and knowledgeable. The difficulty of GPQA both for skilled non-experts and frontier AI systems should enable realistic scalable oversight experiments, which we hope can help devise ways for human experts to reliably get truthful information from AI systems that surpass human capabilities.

* 28 pages, 5 figures, 7 tables

A Multi-Level Framework for the AI Alignment Problem

Jan 10, 2023

Betty Li Hou, Brian Patrick Green

Abstract: AI alignment considers how we can encode AI systems in a way that is compatible with human values. The normative side of this problem asks what moral values or principles, if any, we should encode in AI. To this end, we present a framework to consider the question at four levels: Individual, Organizational, National, and Global. We aim to illustrate how AI alignment is made up of value alignment problems at each of these levels, where values at each level affect the others and effects can flow in either direction. We outline key questions and considerations of each level and demonstrate an application of this framework to the topic of AI content moderation.

* ML Safety Workshop, 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
