Richard Ngo

Avoiding Tampering Incentives in Deep RL via Decoupled Approval

Nov 17, 2020
Via arXiv

REALab: An Embedded Perspective on Tampering

Nov 17, 2020
Via arXiv

Avoiding Side Effects By Considering Future Tasks

Oct 15, 2020
Via arXiv