Alon Albalak

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Dec 04, 2024
Via arXiv

Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data

Jul 20, 2024
Via arXiv

A Mathematical Framework, a Taxonomy of Modeling Paradigms, and a Suite of Learning Techniques for Neural-Symbolic Systems

Jul 12, 2024
Via arXiv

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

Jun 26, 2024
Via arXiv

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024
Via arXiv

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Apr 10, 2024
Via arXiv

A Survey on Data Selection for Language Models

Mar 08, 2024
Via arXiv

Efficient Online Data Mixing For Language Model Pre-Training

Dec 05, 2023
Via arXiv

RWKV: Reinventing RNNs for the Transformer Era

May 22, 2023
Via arXiv

Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning

May 20, 2023
Via arXiv