Srinivasan Iyer

Byte Latent Transformer: Patches Scale Better Than Tokens

Dec 13, 2024

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Nov 07, 2024

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Jul 31, 2024

Instruction-tuned Language Models are Better Knowledge Learners

Feb 20, 2024

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

Dec 28, 2022

Complementary Explanations for Effective In-Context Learning

Nov 25, 2022

Efficient Large Scale Language Modeling with Mixtures of Experts

Dec 20, 2021

Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs

Nov 26, 2021

EASE: Extractive-Abstractive Summarization with Explanations

May 14, 2021

FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale Generation

Dec 31, 2020