Christopher Olah

The Capacity for Moral Self-Correction in Large Language Models

Feb 18, 2023
Via arXiv

Discovering Language Model Behaviors with Model-Written Evaluations

Dec 19, 2022
Via arXiv

Constitutional AI: Harmlessness from AI Feedback

Dec 15, 2022
Via arXiv

Measuring Progress on Scalable Oversight for Large Language Models

Nov 11, 2022
Via arXiv

Toy Models of Superposition

Sep 21, 2022
Via arXiv

Is Generator Conditioning Causally Related to GAN Performance?

Jun 19, 2018
Via arXiv

Conditional Image Synthesis With Auxiliary Classifier GANs

Jul 20, 2017
Via arXiv

Changing Model Behavior at Test-Time Using Reinforcement Learning

Feb 24, 2017
Via arXiv

Document Embedding with Paragraph Vectors

Jul 29, 2015
Via arXiv