James Lee-Thorp

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Memory Augmented Language Models through Mixture of Word Experts

Nov 15, 2023

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

May 22, 2023

CoLT5: Faster Long-Range Transformers with Conditional Computation

Mar 17, 2023

Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints

Dec 09, 2022

Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT

May 24, 2022

Scaling Up Models and Data with $\texttt{t5x}$ and $\texttt{seqio}$

Mar 31, 2022

ShopTalk: A System for Conversational Faceted Search

Sep 02, 2021

FNet: Mixing Tokens with Fourier Transforms

May 09, 2021