Nikunj Saunshi

On the Role of Depth and Looping for In-Context Learning with Task Diversity

Oct 29, 2024
Via arXiv

A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs

Oct 24, 2024
Via arXiv

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?

Oct 10, 2024
Via arXiv

On the Inductive Bias of Stacking Towards Improving Reasoning

Sep 27, 2024
Via arXiv

Landscape-Aware Growing: The Power of a Little LAG

Jun 04, 2024
Via arXiv

Efficient Stagewise Pretraining via Progressive Subnetworks

Feb 08, 2024
Via arXiv

Reasoning in Large Language Models Through Symbolic Math Word Problems

Aug 03, 2023
Via arXiv

Task-Specific Skill Localization in Fine-tuned Language Models

Feb 13, 2023
Via arXiv

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

Nov 05, 2022
Via arXiv

Understanding Influence Functions and Datamodels via Harmonic Analysis

Oct 03, 2022
Via arXiv