Mor Geva

Shammie

Enhancing Automated Interpretability with Output-Centric Feature Descriptions

Jan 14, 2025
Via arXiv

Open Problems in Machine Unlearning for AI Safety

Jan 09, 2025
Via arXiv

Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models

Dec 18, 2024
Via arXiv

Inferring Functionality of Attention Heads from their Parameters

Dec 16, 2024
Via arXiv

Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?

Nov 25, 2024
Via arXiv

Eliciting Textual Descriptions from Representations of Continuous Prompts

Oct 15, 2024
Via arXiv

Language Models Encode Numbers Using Digit Representations in Base 10

Oct 15, 2024
Via arXiv

Towards Interpreting Visual Information Processing in Vision-Language Models

Oct 09, 2024
Via arXiv

CoverBench: A Challenging Benchmark for Complex Claim Verification

Aug 06, 2024
Via arXiv

When Can Transformers Count to n?

Jul 21, 2024
Via arXiv