Charles Lovering

Are Language Model Logits Calibrated?

Oct 21, 2024
Via arXiv

SEC-QA: A Systematic Evaluation Corpus for Financial QA

Jun 20, 2024
Via arXiv

Lessons from the Trenches on Reproducible Evaluation of Language Models

May 23, 2024
Via arXiv

BizBench: A Quantitative Reasoning Benchmark for Business and Finance

Nov 11, 2023
Via arXiv

Deep Neural Networks Can Learn Generalizable Same-Different Visual Relations

Oct 14, 2023
Via arXiv

Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex

Nov 26, 2022
Via arXiv

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022
Via arXiv

Self-play for Data Efficient Language Acquisition

Oct 10, 2020
Via arXiv

When does data augmentation help generalization in NLP?

Apr 30, 2020
Via arXiv