Albert Gu

On the Benefits of Memory for Modeling Time-Dependent PDEs

Sep 03, 2024
Via arXiv

Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models

Aug 19, 2024
Via arXiv

Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers

Jul 13, 2024
Via arXiv

An Empirical Study of Mamba-based Language Models

Jun 12, 2024
Via arXiv

Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality

May 31, 2024
Via arXiv

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

Mar 05, 2024
Via arXiv

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Feb 29, 2024
Via arXiv

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Dec 01, 2023
Via arXiv

Augmenting conformers with structured state space models for online speech recognition

Sep 15, 2023
Via arXiv

Resurrecting Recurrent Neural Networks for Long Sequences

Mar 11, 2023
Via arXiv