Róbert Csordás

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

Oct 28, 2024
Via arXiv

Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations

Aug 20, 2024
Via arXiv

MoEUT: Mixture-of-Experts Universal Transformers

May 25, 2024
Via arXiv

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention

Dec 14, 2023
Via arXiv

Automating Continual Learning

Dec 01, 2023
Via arXiv

Practical Computational Power of Linear Transformers and Their Recurrent and Self-Referential Extensions

Oct 24, 2023
Via arXiv

Approximating Two-Layer Feedforward Networks for Efficient Transformers

Oct 23, 2023
Via arXiv

Mindstorms in Natural Language-Based Societies of Mind

May 26, 2023
Via arXiv

Randomized Positional Encodings Boost Length Generalization of Transformers

May 26, 2023
Via arXiv

Topological Neural Discrete Representation Learning à la Kohonen

Feb 15, 2023
Via arXiv