Adina Williams

Meta AI

Chained Tuning Leads to Biased Forgetting

Dec 21, 2024

What makes a good metric? Evaluating automatic metrics for text-to-image consistency

Dec 18, 2024

Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora

Dec 06, 2024

Transformers Can Navigate Mazes With Multi-Step Prediction

Dec 06, 2024

On the Role of Speech Data in Reducing Toxicity Detection Bias

Nov 12, 2024

The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models

Nov 06, 2024

Improving Model Evaluation using SMART Filtering of Benchmark Datasets

Oct 26, 2024

Are Female Carpenters like Blue Bananas? A Corpus Investigation of Occupation Gender Typicality

Aug 06, 2024

Changing Answer Order Can Decrease MMLU Accuracy

Jun 27, 2024

Decomposed evaluations of geographic disparities in text-to-image models

Jun 17, 2024