Armen Aghajanyan

When Worse is Better: Navigating the compression-generation tradeoff in visual tokenization

Dec 20, 2024

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Jul 31, 2024

Small Molecule Optimization with Large Language Models

Jul 26, 2024

Text Quality-Based Pruning for Efficient Training of Language Models

Apr 26, 2024

DOMINO: A Dual-System for Multi-step Visual Language Reasoning

Oct 04, 2023

Jointly Training Large Autoregressive Multimodal Models

Sep 28, 2023

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Sep 05, 2023

D4: Improving LLM Pretraining via Document De-Duplication and Diversification

Aug 23, 2023

MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers

May 19, 2023

Scaling Laws for Generative Mixed-Modal Language Models

Jan 10, 2023