Suvrit Sra

TU Munich

Graph Transformers Dream of Electric Flow

Oct 22, 2024
Via arXiv

Memory-augmented Transformers can implement Linear First-Order Optimization Methods

Oct 08, 2024
Via arXiv

First-Order Methods for Linearly Constrained Bilevel Optimization

Jun 18, 2024
Via arXiv

Riemannian Bilevel Optimization

May 22, 2024
Via arXiv

Efficient Sampling on Riemannian Manifolds via Langevin MCMC

Feb 15, 2024
Via arXiv

Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context

Dec 26, 2023
Via arXiv

Linear attention is (maybe) all you need (to understand transformer optimization)

Oct 02, 2023
Via arXiv

Invex Programs: First Order Algorithms and Their Convergence

Jul 10, 2023
Via arXiv

Transformers learn to implement preconditioned gradient descent for in-context learning

Jun 01, 2023
Via arXiv

How to escape sharp minima

May 25, 2023
Via arXiv