Helen Jin

Probabilistic Stability Guarantees for Feature Attributions

Apr 18, 2025
Via arXiv

Adaptively evaluating models with task elicitation

Mar 03, 2025
Via arXiv

The FIX Benchmark: Extracting Features Interpretable to eXperts

Sep 20, 2024
Via arXiv

Linguistic Properties of Truthful Response

May 25, 2023
Via arXiv

Generic Temporal Reasoning with Differential Analysis and Explanation

Dec 20, 2022
Via arXiv

Artificial Perceptual Learning: Image Categorization with Weak Supervision

Jun 02, 2021
Via arXiv