Lili Yu

Byte Latent Transformer: Patches Scale Better Than Tokens

Dec 13, 2024

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Nov 07, 2024

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Aug 20, 2024

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Apr 12, 2024

Jointly Training Large Autoregressive Multimodal Models

Sep 28, 2023

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Sep 05, 2023

MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers

May 19, 2023

LIMA: Less Is More for Alignment

May 18, 2023

VideoOFA: Two-Stage Pre-Training for Video-to-Text Generation

May 04, 2023

Scaling Laws for Generative Mixed-Modal Language Models

Jan 10, 2023