Ryan Soklaski

Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models

Jun 17, 2024
Via arXiv

Fourier-Based Augmentations for Improved Robustness and Uncertainty Calibration

Feb 24, 2022
Via arXiv

Tools and Practices for Responsible AI Engineering

Jan 14, 2022
Via arXiv

Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning

Jul 02, 2018
Via arXiv