Clayton Sanford

Fast attention mechanisms: a tale of parallelism

Sep 10, 2025
Via arXiv

When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective

Mar 14, 2025
Via arXiv

Depth-Width tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers

Mar 03, 2025
Via arXiv

Best of Both Worlds: Advantages of Hybrid Graph Sequence Models

Nov 23, 2024
Via arXiv

One-layer transformers fail to solve the induction heads task

Aug 26, 2024
Via arXiv

Understanding Transformer Reasoning Capabilities via Graph Algorithms

May 28, 2024
Via arXiv

Transformers, parallel computation, and logarithmic depth

Feb 14, 2024
Via arXiv

Representational Strengths and Limitations of Transformers

Jun 05, 2023
Via arXiv

Learning Single-Index Models with Shallow Neural Networks

Oct 27, 2022
Via arXiv

On Scrambling Phenomena for Randomly Initialized Recurrent Networks

Oct 11, 2022
Via arXiv