Shashwat Goel

Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation

Feb 26, 2025

Great Models Think Alike and this Undermines AI Oversight

Feb 06, 2025

A Cognac shot to forget bad memories: Corrective Unlearning in GNNs

Dec 01, 2024

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

Mar 06, 2024

Corrective Machine Unlearning

Feb 21, 2024

Representation Engineering: A Top-Down Approach to AI Transparency

Oct 10, 2023

Proportional Aggregation of Preferences for Sequential Decision Making

Jun 26, 2023

Low impact agency: review and discussion

Mar 06, 2023

Evaluating Inexact Unlearning Requires Revisiting Forgetting

Jan 17, 2022

From Pivots to Graphs: Augmented CycleDensity as a Generalization to One Time InverseConsultation

Aug 27, 2021