Abstract: The development of large language models (LLMs) has been instrumental in advancing state-of-the-art natural language processing applications. Training LLMs with billions of parameters and trillions of tokens requires sophisticated distributed systems that enable composing and comparing several state-of-the-art techniques to scale efficiently across thousands of accelerators. However, existing solutions are complex, scattered across multiple libraries/repositories, lack interoperability, and are cumbersome to maintain. Thus, curating and empirically comparing training recipes requires non-trivial engineering effort. This paper introduces TorchTitan, an open-source, PyTorch-native distributed training system that unifies state-of-the-art techniques, streamlining integration and reducing overhead. TorchTitan enables 3D parallelism in a modular manner with elastic scaling, and provides comprehensive logging, checkpointing, and debugging tools for production-ready training. It also incorporates hardware-software co-designed solutions, leveraging features such as Float8 training and SymmetricMemory. As a flexible test bed, TorchTitan facilitates the curation and comparison of custom training recipes, allowing us to develop optimized recipes for Llama 3.1 and to offer guidance on selecting techniques for maximum efficiency based on our experience. We thoroughly assess TorchTitan on the Llama 3.1 family of LLMs, spanning 8 billion to 405 billion parameters, and showcase its exceptional performance, modular composability, and elastic scalability. By stacking training optimizations, we demonstrate accelerations of 65.08% with 1D parallelism at the 128-GPU scale (Llama 3.1 8B), an additional 12.59% with 2D parallelism at the 256-GPU scale (Llama 3.1 70B), and an additional 30% with 3D parallelism at the 512-GPU scale (Llama 3.1 405B) on NVIDIA H100 GPUs over optimized baselines.
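For readers unfamiliar with how the three parallelism dimensions compose, the following is a minimal sketch assuming PyTorch's DeviceMesh API and an illustrative 512-GPU layout; it is not TorchTitan's actual code, and the 8 x 16 x 4 mesh shape is hypothetical.

```python
# Minimal sketch of expressing 3D parallelism over one device mesh.
# Assumes launch under torchrun with 512 ranks; the pp/dp/tp split is illustrative.
from torch.distributed.device_mesh import init_device_mesh

mesh_3d = init_device_mesh(
    "cuda",
    (8, 16, 4),                          # pp x dp x tp = 512 GPUs
    mesh_dim_names=("pp", "dp", "tp"),
)

pp_mesh = mesh_3d["pp"]  # pipeline parallelism across model stages
dp_mesh = mesh_3d["dp"]  # data parallelism (e.g., FSDP sharding)
tp_mesh = mesh_3d["tp"]  # tensor parallelism within each stage
```

Each named sub-mesh can then be handed to the corresponding parallelism wrapper, which is the kind of modular composition the abstract describes.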
Abstract: We study a collaborative scenario where a user not only instructs a system to complete tasks, but also acts alongside it. This allows the user to adapt to the system's abilities by changing their language or deciding to simply accomplish some tasks themselves, and requires the system to effectively recover from errors as the user strategically assigns it new goals. We build a game environment to study this scenario, and learn to map user instructions to system actions. We introduce a learning approach focused on recovery from cascading errors between instructions, and modeling methods to explicitly reason about instructions with multiple goals. We evaluate with a new protocol using recorded interactions and online games with human users, and observe how users adapt to the system's abilities.
Abstract: We introduce a new dataset for joint reasoning about language and vision. The data contains 107,296 examples of English sentences paired with web photographs. The task is to determine whether a natural language caption is true about a photograph. We present an approach for finding visually complex images and crowdsourcing linguistically diverse captions. Qualitative analysis shows the data requires complex reasoning about quantities, comparisons, and relationships between objects. Evaluation of state-of-the-art visual reasoning methods shows the data is a challenge for current methods.
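To illustrate the task format described above, here is a minimal scoring sketch for binary caption verification; the example fields (sentence, image_path, label) and the predict callable are hypothetical, not the dataset's actual schema.

```python
# Hedged sketch of binary caption verification; field names and predict() are hypothetical.
from typing import Callable

def accuracy(examples: list[dict], predict: Callable[[str, str], bool]) -> float:
    """Each example pairs an English sentence with a photograph and a true/false label."""
    correct = sum(
        int(predict(ex["sentence"], ex["image_path"]) == ex["label"])
        for ex in examples
    )
    return correct / len(examples)
```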