Picture for Xi Victoria Lin

Xi Victoria Lin

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Add code
Nov 07, 2024
Figure 1 for Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Figure 2 for Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Figure 3 for Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Figure 4 for Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Viaarxiv icon

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Add code
Jul 31, 2024
Figure 1 for MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Figure 2 for MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Figure 3 for MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Figure 4 for MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Viaarxiv icon

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

Add code
May 29, 2024
Viaarxiv icon

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Add code
Mar 12, 2024
Figure 1 for Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Figure 2 for Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Figure 3 for Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Figure 4 for Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Viaarxiv icon

Instruction-tuned Language Models are Better Knowledge Learners

Add code
Feb 20, 2024
Viaarxiv icon

In-Context Pretraining: Language Modeling Beyond Document Boundaries

Add code
Oct 20, 2023
Viaarxiv icon

RA-DIT: Retrieval-Augmented Dual Instruction Tuning

Add code
Oct 08, 2023
Figure 1 for RA-DIT: Retrieval-Augmented Dual Instruction Tuning
Figure 2 for RA-DIT: Retrieval-Augmented Dual Instruction Tuning
Figure 3 for RA-DIT: Retrieval-Augmented Dual Instruction Tuning
Figure 4 for RA-DIT: Retrieval-Augmented Dual Instruction Tuning
Viaarxiv icon

Reimagining Retrieval Augmented Language Models for Answering Queries

Add code
Jun 01, 2023
Viaarxiv icon

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model

Add code
May 23, 2023
Figure 1 for Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Figure 2 for Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Figure 3 for Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Figure 4 for Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Viaarxiv icon

LEVER: Learning to Verify Language-to-Code Generation with Execution

Add code
Feb 16, 2023
Viaarxiv icon