Lionel Levine

Exploring a Datasets Statistical Effect Size Impact on Model Performance, and Data Sample-Size Sufficiency

Jan 05, 2025
Via arXiv

FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

Nov 07, 2024
Via arXiv

Exploring Cross-model Neuronal Correlations in the Context of Predicting Model Performance and Generalizability

Aug 15, 2024
Via arXiv

Do language models plan ahead for future tokens?

Apr 01, 2024
Via arXiv

A Self-supervised Framework for Improved Data-Driven Monitoring of Stress via Multi-modal Passive Sensing

Mar 24, 2023
Via arXiv