Preetum Nakkiran

A Formal Framework for Understanding Length Generalization in Transformers

Oct 03, 2024
Via arXiv

Classifier-Free Guidance is a Predictor-Corrector

Aug 16, 2024
Via arXiv

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks

Jul 03, 2024
Via arXiv

Step-by-Step Diffusion: An Elementary Tutorial

Jun 13, 2024
Via arXiv

When is Multicalibration Post-Processing Necessary?

Jun 10, 2024
Via arXiv

Perspectives on the State and Future of Deep Learning - 2023

Dec 19, 2023
Via arXiv

LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures

Dec 07, 2023
Via arXiv

Vanishing Gradients in Reinforcement Finetuning of Language Models

Oct 31, 2023
Via arXiv

What Algorithms can Transformers Learn? A Study in Length Generalization

Oct 24, 2023
Via arXiv

Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing

Sep 21, 2023
Via arXiv