Stuart Armstrong

CoinRun: Solving Goal Misgeneralisation

Sep 28, 2023
Via arXiv

Concept Extrapolation: A Conceptual Primer

Jun 19, 2023
Via arXiv

Recognising the importance of preference change: A call for a coordinated multidisciplinary research effort in the age of AI

Mar 30, 2022
Via arXiv

The dangers in algorithms learning humans' values and irrationalities

Mar 01, 2022
Via arXiv

Chess as a Testing Grounds for the Oracle Approach to AI Safety

Oct 06, 2020
Via arXiv

Pitfalls of learning a reward function online

Apr 28, 2020
Via arXiv

Occam's razor is insufficient to infer the preferences of irrational agents

Oct 29, 2018
Via arXiv

Good and safe uses of AI Oracles

Jun 05, 2018
Via arXiv

'Indifference' methods for managing agent rewards

Jun 05, 2018
Via arXiv

Counterfactual equivalence for POMDPs, and underlying deterministic environments

Jan 14, 2018
Via arXiv