Jason Ramapuram

Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration

Dec 26, 2025
Via arXiv

Learning Unmasking Policies for Diffusion Language Models

Dec 12, 2025
Via arXiv

Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training

Dec 09, 2025
Via arXiv

Distillation Scaling Laws

Feb 12, 2025
Via arXiv

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

Sep 06, 2024
Via arXiv

Poly-View Contrastive Learning

Mar 08, 2024
Via arXiv

Bootstrap Your Own Variance

Dec 06, 2023
Via arXiv

How to Scale Your EMA

Jul 27, 2023
Via arXiv

The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

Jul 20, 2023
Via arXiv

DUET: 2D Structured and Approximately Equivariant Representations

Jun 30, 2023
Via arXiv