Li Lucy

On Classification with Large Language Models in Cultural Analytics

Oct 15, 2024
Via arXiv

Evaluating Language Model Math Reasoning via Grounding in Educational Curricula

Aug 08, 2024
Via arXiv

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Jan 31, 2024
Via arXiv

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters

Jan 16, 2024
Via arXiv

"One-size-fits-all"? Observations and Expectations of NLG Systems Across Identity-Related Language Features

Oct 23, 2023
Via arXiv

Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications

Dec 19, 2022
Via arXiv

Characterizing English Variation across Social Media Communities with BERT

Feb 12, 2021
Via arXiv

Using Sentiment Induction to Understand Variation in Gendered Online Communities

Nov 16, 2018
Via arXiv

Are distributional representations ready for the real world? Evaluating word vectors for grounded perceptual meaning

May 31, 2017
Via arXiv