Susan Zhang

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Sep 05, 2023

LIMA: Less Is More for Alignment

May 18, 2023

A Theory on Adam Instability in Large-Scale Machine Learning

Apr 25, 2023

Effective Theory of Transformers at Initialization

Apr 04, 2023

Scaling Laws for Generative Mixed-Modal Language Models

Jan 10, 2023

OPT: Open Pre-trained Transformer Language Models

May 05, 2022

Long-Term Planning and Situational Awareness in OpenAI Five

Dec 13, 2019

Neural Network Surgery with Sets

Dec 13, 2019

Dota 2 with Large Scale Deep Reinforcement Learning

Dec 13, 2019