Yair Carmon

Filter Like You Test: Data-Driven Data Filtering for CLIP Pretraining

Mar 11, 2025

An Analytical Model for Overparameterized Learning Under Class Imbalance

Mar 07, 2025

Resolving Discrepancies in Compute-Optimal Scaling of Language Models

Jun 27, 2024

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024

Accelerated Parameter-Free Stochastic Optimization

Mar 31, 2024

Language models scale reliably with over-training and on downstream tasks

Mar 13, 2024

The Price of Adaptivity in Stochastic Convex Optimization

Feb 16, 2024

A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions

Nov 17, 2023

Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond

May 22, 2023

DataComp: In search of the next generation of multimodal datasets

May 03, 2023