Sanjiv Kumar

Google Research

LAuReL: Learned Augmented Residual Layer

Nov 13, 2024
Via arXiv

On the Role of Depth and Looping for In-Context Learning with Task Diversity

Oct 29, 2024
Via arXiv

LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization

Oct 27, 2024
Via arXiv

A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs

Oct 24, 2024
Via arXiv

No more hard prompts: SoftSRV prompting for synthetic data generation

Oct 23, 2024
Via arXiv

Mimetic Initialization Helps State Space Models Learn to Recall

Oct 14, 2024
Via arXiv

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?

Oct 10, 2024
Via arXiv

On the Inductive Bias of Stacking Towards Improving Reasoning

Sep 27, 2024
Via arXiv

Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines

Jul 22, 2024
Via arXiv

Efficient Document Ranking with Learnable Late Interactions

Jun 25, 2024
Via arXiv