Sanjiv Kumar

Google Research

LAuReL: Learned Augmented Residual Layer

Nov 13, 2024
Via arXiv

On the Role of Depth and Looping for In-Context Learning with Task Diversity

Oct 29, 2024
Via arXiv

LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization

Oct 27, 2024
Via arXiv

A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs

Oct 24, 2024
Via arXiv

No more hard prompts: SoftSRV prompting for synthetic data generation

Oct 23, 2024
Via arXiv

Mimetic Initialization Helps State Space Models Learn to Recall

Oct 14, 2024
Via arXiv

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?

Oct 10, 2024
Via arXiv

On the Inductive Bias of Stacking Towards Improving Reasoning

Sep 27, 2024
Via arXiv

Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines

Jul 22, 2024
Via arXiv

Efficient Document Ranking with Learnable Late Interactions

Jun 25, 2024
Via arXiv