Andrey Gromov

Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations

Jun 13, 2024

Grokking Modular Polynomials

Jun 05, 2024

Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks

Jun 04, 2024

Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data

Apr 01, 2024

The Unreasonable Ineffectiveness of the Deeper Layers

Mar 26, 2024

Bridging Associative Memory and Probabilistic Modeling

Feb 15, 2024

To grok or not to grok: Disentangling generalization and memorization on corrupted algorithmic datasets

Oct 19, 2023

Grokking modular arithmetic

Jan 06, 2023

AutoInit: Automatic Initialization via Jacobian Tuning

Jun 27, 2022

Critical initialization of wide and deep neural networks through partial Jacobians: general theory and applications to LayerNorm

Nov 30, 2021