Tokio Kajitsuka

Optimal Memorization Capacity of Transformers

Sep 26, 2024
Via arXiv

Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?

Jul 26, 2023
Via arXiv