Thomas Icard

Belief in the Machine: Investigating Epistemological Blind Spots of Language Models

Oct 28, 2024

A Reply to Makelov et al.'s "Interpretability Illusion" Arguments

Jan 23, 2024

Comparing Causal Frameworks: Potential Outcomes, Structural Models, Graphs, and Abstractions

Jun 25, 2023

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

Mar 05, 2023

Causal Abstraction for Faithful Model Interpretation

Jan 11, 2023

Causal Abstraction with Soft Interventions

Nov 22, 2022

Holistic Evaluation of Language Models

Nov 16, 2022

Causal Distillation for Language Models

Dec 05, 2021

Inducing Causal Structure for Interpretable Neural Networks

Dec 01, 2021

On the Opportunities and Risks of Foundation Models

Aug 18, 2021