Zachary Tatlock

Magic Markup: Maintaining Document-External Markup with an LLM

Mar 06, 2024

Edward Misback, Zachary Tatlock, Steven L. Tanimoto

Abstract: Text documents, including programs, typically have human-readable semantic structure. Historically, programmatic access to these semantics has required explicit in-document tagging. Especially in systems where the text has an execution semantics, this means it is an opt-in feature that is hard to support properly. Today, language models offer a new method: metadata can be bound to entities in changing text using a model's human-like understanding of semantics, with no requirements on the document structure. This method expands the applications of document annotation, a fundamental operation in program writing, debugging, maintenance, and presentation. We contribute a system that employs an intelligent agent to re-tag modified programs, enabling rich annotations to automatically follow code as it evolves. We also contribute a formal problem definition, an empirical synthetic benchmark suite, and our benchmark generator. Our system achieves an accuracy of 90% on our benchmarks and can replace a document's tags in parallel at a rate of 5 seconds per tag. While there remains significant room for improvement, we find performance reliable enough to justify further exploration of applications.

* 10 pages; 2 figures; to be published in the <Programming> 2024 Conference Companion

Dynamic Tensor Rematerialization

Jun 18, 2020

Marisa Kirisame, Steven Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared Roesch, Tianqi Chen, Zachary Tatlock

Abstract: Checkpointing enables training larger models by freeing intermediate activations and recomputing them on demand. Previous checkpointing techniques are difficult to generalize to dynamic models because they statically plan recomputations offline. We present Dynamic Tensor Rematerialization (DTR), a greedy online algorithm for heuristically checkpointing arbitrary models. DTR is extensible and general: it is parameterized by an eviction policy and only collects lightweight metadata on tensors and operators. Though DTR has no advance knowledge of the model or training task, we prove it can train an $N$-layer feedforward network on an $\Omega(\sqrt{N})$ memory budget with only $\mathcal{O}(N)$ tensor operations. Moreover, we identify a general eviction heuristic and show how it allows DTR to automatically provide favorable checkpointing performance across a variety of models and memory budgets.

* 28 pages, 11 figures, implementation available here: https://github.com/uwsampl/dtr-prototype
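
The greedy online scheme described in this abstract can be illustrated with a toy runtime. The sketch below is not the authors' implementation (see the linked repository for that); it is a minimal illustration under stated assumptions. The names `Tensor`, `ToyDTR`, and `build_chain` are hypothetical, tensors hold plain floats, and the eviction score `cost / (size * staleness)` is a simplified stand-in for the kind of heuristic the abstract mentions: prefer evicting tensors that are cheap to recompute, large, and not recently used.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(eq=False)  # identity-based equality, so list.remove targets the exact object
class Tensor:
    name: str
    op: Callable[..., float]      # recomputes the value from the parents' values
    parents: List["Tensor"]
    size: int = 1                 # memory units occupied while resident
    cost: float = 1.0             # assumed cost of running `op` once
    value: Optional[float] = None # None means "evicted / not yet computed"
    last_access: int = 0
    pinned: int = 0               # >0 while needed as an input to a running op


class ToyDTR:
    """Hypothetical toy runtime: evict greedily, rematerialize on demand."""

    def __init__(self, budget: int):
        self.budget = budget
        self.used = 0
        self.clock = 0
        self.op_count = 0
        self.resident: List[Tensor] = []

    def _score(self, t: Tensor) -> float:
        # Assumed heuristic: low cost, large size, and high staleness
        # all make a tensor a better eviction candidate (lower score).
        staleness = self.clock - t.last_access + 1
        return t.cost / (t.size * staleness)

    def _evict_until_fits(self, size: int) -> None:
        while self.used + size > self.budget:
            candidates = [t for t in self.resident if t.pinned == 0]
            if not candidates:
                raise MemoryError("budget too small for this operator")
            victim = min(candidates, key=self._score)
            victim.value = None
            self.resident.remove(victim)
            self.used -= victim.size

    def get(self, t: Tensor) -> float:
        """Return t's value, recomputing it (and its ancestors) if evicted."""
        self.clock += 1
        if t.value is None:
            args, pinned = [], []
            try:
                for p in t.parents:
                    args.append(self.get(p))
                    p.pinned += 1          # keep inputs alive until op runs
                    pinned.append(p)
                self._evict_until_fits(t.size)
                t.value = t.op(*args)
                self.op_count += 1
                self.resident.append(t)
                self.used += t.size
            finally:
                for p in pinned:
                    p.pinned -= 1
        t.last_access = self.clock
        return t.value


def build_chain(n: int) -> List[Tensor]:
    """Build x0 = 1.0 and x_i = x_{i-1} + 1.0 for i = 1..n."""
    chain = [Tensor("x0", lambda: 1.0, [])]
    for i in range(1, n + 1):
        chain.append(Tensor(f"x{i}", lambda v: v + 1.0, [chain[-1]]))
    return chain
```

Running a forward pass over `build_chain(8)` with `ToyDTR(budget=3)` and then reading the tensors back in reverse order (roughly what a backward pass does) keeps memory within the budget at the price of extra operator executions, which `op_count` records. The sketch makes no attempt to reproduce the paper's $\Omega(\sqrt{N})$ memory and $\mathcal{O}(N)$ operation bound.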

Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference

Jun 04, 2020

Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, Yida Wang

Abstract: Modern deep neural networks increasingly make use of features such as dynamic control flow, data structures and dynamic tensor shapes. Existing deep learning systems focus on optimizing and executing static neural networks which assume a pre-determined model architecture and input data shapes--assumptions which are violated by dynamic neural networks. Therefore, executing dynamic models with deep learning systems is currently both inflexible and sub-optimal, if not impossible. Optimizing dynamic neural networks is more challenging than static neural networks; optimizations must consider all possible execution paths and tensor shapes. This paper proposes Nimble, a high-performance and flexible system to optimize, compile, and execute dynamic neural networks on multiple platforms. Nimble handles model dynamism by introducing a dynamic type system, a set of dynamism-oriented optimizations, and a light-weight virtual machine runtime. Our evaluation demonstrates that Nimble outperforms state-of-the-art deep learning frameworks and runtime systems for dynamic neural networks by up to 20x on hardware platforms including Intel CPUs, ARM CPUs, and Nvidia GPUs.

Relay: A High-Level IR for Deep Learning

Apr 17, 2019

Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Josh Pollock, Logan Weber, Ziheng Jiang, Tianqi Chen, Thierry Moreau, Zachary Tatlock

Abstract: Frameworks for writing, compiling, and optimizing deep learning (DL) models have recently enabled progress in areas like computer vision and natural language processing. Extending these frameworks to accommodate the rapidly diversifying landscape of DL models and hardware platforms presents challenging tradeoffs between expressiveness, composability, and portability. We present Relay, a new intermediate representation (IR) and compiler framework for DL models. The functional, statically-typed Relay IR unifies and generalizes existing DL IRs and can express state-of-the-art models. Relay's expressive IR required careful design of the type system, automatic differentiation, and optimizations. Relay's extensible compiler can eliminate abstraction overhead and target new hardware platforms. The design insights from Relay can be applied to existing frameworks to develop IRs that support extension without compromising on expressivity, composability, and portability. Our evaluation demonstrates that the Relay prototype can already provide competitive performance for a broad class of models running on CPUs, GPUs, and FPGAs.

Relay: A New IR for Machine Learning Frameworks

Sep 26, 2018

Jared Roesch, Steven Lyubomirsky, Logan Weber, Josh Pollock, Marisa Kirisame, Tianqi Chen, Zachary Tatlock

Abstract: Machine learning powers diverse services in industry including search, translation, recommendation systems, and security. The scale and importance of these models require that they be efficient, expressive, and portable across an array of heterogeneous hardware devices. These constraints are often at odds; in order to better accommodate them we propose a new high-level intermediate representation (IR) called Relay. Relay is being designed as a purely-functional, statically-typed language with the goal of balancing efficient compilation, expressiveness, and portability. We discuss the goals of Relay and highlight its important design constraints. Our prototype is part of the open source NNVM compiler framework, which powers Amazon's deep learning framework MXNet.
