
Nikhil Ghosh

GSM-Agent: Understanding Agentic Reasoning Using Controllable Environments

Sep 26, 2025
Via arXiv

PLoP: Precise LoRA Placement for Efficient Finetuning of Large Models

Jun 25, 2025

The Impact of Initialization on LoRA Finetuning Dynamics

Jun 12, 2024

LoRA+: Efficient Low Rank Adaptation of Large Models

Feb 19, 2024

More is Better in Modern Machine Learning: when Infinite Overparameterization is Optimal and Overfitting is Obligatory

Nov 27, 2023

The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning

Aug 06, 2023

The Power of External Memory in Increasing Predictive Model Capacity

Jan 31, 2023

Alternating Updates for Efficient Transformers

Jan 30, 2023

A Universal Trade-off Between the Model Size, Test Loss, and Training Loss of Linear Predictors

Jul 23, 2022

Deconstructing Distributions: A Pointwise Framework of Learning

Feb 20, 2022