Matthew Klawonn

DiagrammaticLearning: A Graphical Language for Compositional Training Regimes

Jan 02, 2025

Mason Lary, Richard Samuelson, Alexander Wilentz, Alina Zare, Matthew Klawonn, James P. Fairbanks

Abstract: Motivated by deep learning regimes with multiple interacting yet distinct model components, we introduce learning diagrams, graphical depictions of training setups that capture parameterized learning as data rather than code. A learning diagram compiles to a unique loss function on which component models are trained. The result of training on this loss is a collection of models whose predictions "agree" with one another. We show that a number of popular learning setups such as few-shot multi-task learning, knowledge distillation, and multi-modal learning can be depicted as learning diagrams. We further implement learning diagrams in a library that allows users to build diagrams of PyTorch and Flux.jl models. By implementing some classic machine learning use cases, we demonstrate how learning diagrams allow practitioners to build complicated models as compositions of smaller components, identify relationships between workflows, and manipulate models during or after training. Leveraging a category theoretic framework, we introduce a rigorous semantics for learning diagrams that puts such operations on a firm mathematical foundation.
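The idea of compiling a diagram into a loss that rewards agreement can be illustrated with a minimal sketch. This is not the paper's library or its API: the names `compose` and `agreement_loss`, the representation of a diagram as pairs of parallel paths, and the squared-difference penalty are all assumptions made for illustration. Each path is a list of scalar functions, and the loss measures how much two paths that should agree actually differ on sample inputs.

```python
from functools import reduce


def compose(path):
    """Compose a list of functions left to right into a single function."""
    return lambda x: reduce(lambda acc, f: f(acc), path, x)


def agreement_loss(parallel_paths, samples):
    """Mean squared disagreement between each pair of parallel paths.

    parallel_paths: list of (left_path, right_path) pairs (hypothetical
    stand-in for a diagram); each path is a list of functions.
    samples: list of scalar inputs fed to both paths.
    """
    total = 0.0
    for left, right in parallel_paths:
        f, g = compose(left), compose(right)
        total += sum((f(x) - g(x)) ** 2 for x in samples)
    return total / len(samples)


# Distillation-style toy: a fixed "teacher" path and a "student" path.
teacher = [lambda x: 2 * x]
student = [lambda x: x]
loss = agreement_loss([(teacher, student)], [1.0, 2.0, 3.0])
```

In this toy setting the loss is zero exactly when the two paths produce the same outputs on every sample, which is the sense of "agreement" assumed here.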

Generalized Gradient Descent is a Hypergraph Functor

Mar 28, 2024

Tyler Hanks, Matthew Klawonn, James Fairbanks

Abstract: Cartesian reverse derivative categories (CRDCs) provide an axiomatic generalization of the reverse derivative, which allows generalized analogues of classic optimization algorithms such as gradient descent to be applied to a broad class of problems. In this paper, we show that generalized gradient descent with respect to a given CRDC induces a hypergraph functor from a hypergraph category of optimization problems to a hypergraph category of dynamical systems. The domain of this functor consists of objective functions that are 1) general in the sense that they are defined with respect to an arbitrary CRDC, and 2) open in that they are decorated spans that can be composed with other such objective functions via variable sharing. The codomain is specified analogously as a category of general and open dynamical systems for the underlying CRDC. We describe how the hypergraph functor induces a distributed optimization algorithm for arbitrary composite problems specified in the domain. To illustrate the kinds of problems our framework can model, we show that parameter sharing models in multitask learning, a prevalent machine learning paradigm, yield a composite optimization problem for a given choice of CRDC. We then apply the gradient descent functor to this composite problem and describe the resulting distributed gradient descent algorithm for training parameter sharing models.
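For intuition, composition by variable sharing can be sketched in the ordinary Euclidean setting (smooth functions with the usual gradient, one familiar instance of reverse differentiation). The two quadratic task objectives, the variable names `s`, `a`, `b`, and the helper names below are hypothetical and chosen only for illustration. Each subproblem computes its gradient locally, and the gradients are summed on the shared variable `s`.

```python
def grad_task1(v):
    """Gradient of f1(s, a) = (s - 1)^2 + (a - s)^2 (hypothetical task)."""
    s, a = v["s"], v["a"]
    return {"s": 2 * (s - 1) - 2 * (a - s), "a": 2 * (a - s)}


def grad_task2(v):
    """Gradient of f2(s, b) = (s - 3)^2 + (b + s)^2 (hypothetical task)."""
    s, b = v["s"], v["b"]
    return {"s": 2 * (s - 3) + 2 * (b + s), "b": 2 * (b + s)}


def composite_step(v, subproblem_grads, lr):
    """One gradient step on the sum of the subproblems.

    Each subproblem reports gradients only for the variables it touches;
    contributions to a shared variable are added together.
    """
    total = {name: 0.0 for name in v}
    for grad in subproblem_grads:
        for name, g in grad(v).items():
            total[name] += g
    return {name: v[name] - lr * total[name] for name in v}


def train(steps=500, lr=0.1):
    v = {"s": 0.0, "a": 0.0, "b": 0.0}
    for _ in range(steps):
        v = composite_step(v, [grad_task1, grad_task2], lr)
    return v


result = train()
```

For these particular objectives the composite minimum is at `s = 2`, `a = 2`, `b = -2`, since `a = s` and `b = -s` zero the coupling terms and `s = 2` minimizes the remaining `(s - 1)^2 + (s - 3)^2`.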

Exploiting Class Learnability in Noisy Data

Nov 15, 2018

Matthew Klawonn, Eric Heim, James Hendler

Abstract: In many domains, collecting sufficient labeled training data for supervised machine learning requires easily accessible but noisy sources, such as crowdsourcing services or tagged Web data. Noisy labels occur frequently in data sets harvested via these means, sometimes resulting in entire classes of data on which learned classifiers generalize poorly. For real-world applications, we argue that it can be beneficial to avoid training on such classes entirely. In this work, we aim to explore the classes in a given data set, and guide supervised training to spend time on a class proportional to its learnability. By focusing the training process, we aim to improve model generalization on classes with a strong signal. To that end, we develop an online algorithm that works in conjunction with a classifier and training algorithm, iteratively selecting training data for the classifier based on how well it appears to generalize on each class. Testing our approach on a variety of data sets, we show our algorithm learns to focus on classes for which the model has low generalization error relative to strong baselines, yielding a classifier with good performance on learnable classes.
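Spending training time on a class in proportion to its learnability can be sketched as weighted sampling. This is a simplification, not the paper's online algorithm: the per-class scores (for example, held-out accuracy), the class names, and the helpers `class_weights` and `sample_classes` are assumptions for illustration.

```python
import random


def class_weights(scores):
    """Normalize per-class learnability scores into sampling probabilities.

    scores: dict mapping class name to a non-negative score, such as an
    estimate of held-out accuracy (an assumption of this sketch).
    """
    total = sum(scores.values())
    if total == 0:
        # No class looks learnable yet: fall back to uniform sampling.
        return {c: 1.0 / len(scores) for c in scores}
    return {c: s / total for c, s in scores.items()}


def sample_classes(weights, k, rng):
    """Draw k class names, each with probability given by its weight."""
    classes = list(weights)
    return rng.choices(classes, weights=[weights[c] for c in classes], k=k)


# Hypothetical scores: "noise" stands for a class with no usable signal.
weights = class_weights({"cat": 0.9, "dog": 0.6, "noise": 0.0})
batch_classes = sample_classes(weights, k=1000, rng=random.Random(0))
```

In an online loop, the scores would be re-estimated after each round of training and the weights recomputed, so that classes on which the model generalizes poorly are drawn less often.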

* Accepted to AAAI 2019

Generating Triples with Adversarial Networks for Scene Graph Construction

Feb 07, 2018

Matthew Klawonn, Eric Heim

Abstract: Driven by successes in deep learning, computer vision research has begun to move beyond object detection and image classification to more sophisticated tasks like image captioning or visual question answering. Motivating such endeavors is the desire for models to capture not only objects present in an image, but more fine-grained aspects of a scene such as relationships between objects and their attributes. Scene graphs provide a formal construct for capturing these aspects of an image. Despite this, there have been only a few recent efforts to generate scene graphs from imagery. Previous works limit themselves to settings where bounding box information is available at train time and do not attempt to generate scene graphs with attributes. In this paper we propose a method, based on recent advancements in Generative Adversarial Networks, to overcome these deficiencies. We take the approach of first generating small subgraphs, each describing a single statement about a scene from a specific region of the input image chosen using an attention mechanism. By doing so, our method is able to produce portions of the scene graphs with attribute information without the need for bounding box labels. Then, the complete scene graph is constructed from these subgraphs. We show that our model improves upon prior work in scene graph generation on state-of-the-art data sets and accepted metrics. Further, we demonstrate that our model is capable of handling a larger vocabulary size than prior work has attempted.

* Accepted to AAAI 2018
