Iason Gabriel

Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data

Jun 19, 2024
Via arXiv

Holistic Safety and Responsibility Evaluations of Advanced AI Models

Apr 22, 2024
Via arXiv

Sociotechnical Safety Evaluation of Generative AI Systems

Oct 31, 2023
Via arXiv

Model evaluation for extreme risks

May 24, 2023
Via arXiv

Manifestations of Xenophobia in AI Systems

Dec 15, 2022
Via arXiv

A Human Rights-Based Approach to Responsible AI

Oct 06, 2022
Via arXiv

Improving alignment of dialogue agents via targeted human judgements

Sep 28, 2022
Via arXiv

In conversation with Artificial Intelligence: aligning language models with human values

Sep 01, 2022
Via arXiv

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

Jun 16, 2022
Via arXiv

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Dec 08, 2021
Via arXiv