Suriya Gunasekar

Microsoft Research

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

Oct 24, 2023

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

Sep 26, 2023

Textbooks Are All You Need II: phi-1.5 technical report

Sep 11, 2023

Textbooks Are All You Need

Jun 20, 2023

(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability

Feb 17, 2023

How to Fine-Tune Vision Models with SGD

Nov 17, 2022

Neural-Sim: Learning to Generate Training Data with NeRF

Jul 22, 2022

Generalization to translation shifts: a study in architectures and augmentations

Jul 05, 2022

Unveiling Transformers with LEGO: a synthetic reasoning task

Jun 09, 2022