Harsh Mehta

Shammie

The Road Less Scheduled

May 24, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

When, Why and How Much? Adaptive Learning Rate Scheduling by Refinement

Oct 11, 2023

Mechanic: A Learning Rate Tuner

Jun 02, 2023

Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

Feb 11, 2023

Simplifying and Understanding State Space Models with Diagonal Linear RNNs

Dec 07, 2022

Differentially Private Image Classification from Features

Nov 24, 2022

Convexifying Transformers: Improving optimization and understanding of transformer networks

Nov 20, 2022

Long Range Language Modeling via Gated State Spaces

Jul 02, 2022