Jasmine Collins

CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

Oct 25, 2023

Aaron Gokaslan, A. Feder Cooper, Jasmine Collins, Landan Seguin, Austin Jacobson, Mihir Patel, Jonathan Frankle, Cory Stephenson, Volodymyr Kuleshov

Abstract: We assemble a dataset of Creative-Commons-licensed (CC) images, which we use to train a set of open diffusion models that are qualitatively competitive with Stable Diffusion 2 (SD2). This task presents two challenges: (1) high-resolution CC images lack the captions necessary to train text-to-image generative models; (2) CC images are relatively scarce. In turn, to address these challenges, we use an intuitive transfer learning technique to produce a set of high-quality synthetic captions paired with curated CC images. We then develop a data- and compute-efficient training recipe that requires as little as 3% of the LAION-2B data needed to train existing SD2 models, but obtains comparable quality. These results indicate that we have a sufficient number of CC images (~70 million) for training high-quality models. Our training recipe also implements a variety of optimizations that achieve ~3X training speed-ups, enabling rapid model iteration. We leverage this recipe to train several high-quality text-to-image models, which we dub the CommonCanvas family. Our largest model achieves comparable performance to SD2 on a human evaluation, despite being trained on our CC dataset that is significantly smaller than LAION and using synthetic captions for training. We release our models, data, and code at https://github.com/mosaicml/diffusion/blob/main/assets/common-canvas.md

CA$^2$T-Net: Category-Agnostic 3D Articulation Transfer from Single Image

Jan 05, 2023

Jasmine Collins, Anqi Liang, Jitendra Malik, Hao Zhang, Frédéric Devernay

Abstract: We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model. Our network learns to predict the object's pose, part segmentation, and corresponding motion parameters to reproduce the articulation shown in the input image. The network is composed of three distinct branches that take a shared joint image-shape embedding and is trained end-to-end. Unlike previous methods, our approach is independent of the topology of the object and can work with objects from arbitrary categories. Our method, trained with only synthetic data, can be used to automatically animate a mesh, infer motion from real images, and transfer articulation to functionally similar but geometrically distinct 3D models at test time.

* 8 pages

Towards Understanding How Machines Can Learn Causal Overhypotheses

Jun 16, 2022

Eliza Kosoy, David M. Chan, Adrian Liu, Jasmine Collins, Bryanna Kaufmann, Sandy Han Huang, Jessica B. Hamrick, John Canny, Nan Rosemary Ke, Alison Gopnik

Abstract: Recent work in machine learning and cognitive science has suggested that understanding causal information is essential to the development of intelligence. The extensive literature in cognitive science using the "blicket detector" environment shows that children are adept at many kinds of causal inference and learning. We propose to adapt that environment for machine learning agents. One of the key challenges for current machine learning algorithms is modeling and understanding causal overhypotheses: transferable abstract hypotheses about sets of causal relationships. In contrast, even young children spontaneously learn and use causal overhypotheses. In this work, we present a new benchmark -- a flexible environment which allows for the evaluation of existing techniques under variable causal overhypotheses -- and demonstrate that many existing state-of-the-art methods have trouble generalizing in this environment. The code and resources for this benchmark are available at https://github.com/CannyLab/casual_overhypotheses.
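
To make the idea of a causal overhypothesis concrete, here is a minimal sketch of a blicket-detector rule. It is not the benchmark's code or API. The function name and the two rule names ("disjunctive": any one blicket activates the detector; "conjunctive": at least two blickets are needed) are assumptions taken from the general cognitive science literature on blicket detectors, not from this abstract.

```python
def detector_lights_up(objects_on_detector, blickets, overhypothesis):
    """Return True if the (hypothetical) detector activates.

    objects_on_detector: iterable of object names currently placed on the detector.
    blickets: set of object names that are blickets (hidden from the learner).
    overhypothesis: "disjunctive" or "conjunctive" (assumed rule names).
    """
    blickets_present = len(set(objects_on_detector) & set(blickets))
    if overhypothesis == "disjunctive":
        # One blicket is enough to activate the detector.
        return blickets_present >= 1
    if overhypothesis == "conjunctive":
        # At least two blickets must be present together.
        return blickets_present >= 2
    raise ValueError(f"unknown overhypothesis: {overhypothesis!r}")


blickets = {"A", "B"}
single_disjunctive = detector_lights_up(["A"], blickets, "disjunctive")
single_conjunctive = detector_lights_up(["A"], blickets, "conjunctive")
pair_conjunctive = detector_lights_up(["A", "B"], blickets, "conjunctive")
```

The same observation (placing only object A) gives different outcomes under the two rules, so a learner has to infer the abstract rule as well as which objects are blickets.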

Learning Causal Overhypotheses through Exploration in Children and Computational Models

Feb 21, 2022

Eliza Kosoy, Adrian Liu, Jasmine Collins, David M Chan, Jessica B Hamrick, Nan Rosemary Ke, Sandy H Huang, Bryanna Kaufmann, John Canny, Alison Gopnik

Abstract: Despite recent progress in reinforcement learning (RL), RL algorithms for exploration remain an active area of research. Existing methods often focus on state-based metrics, which do not consider the underlying causal structures of the environment, and while recent research has begun to explore RL environments for causal learning, these environments primarily leverage causal information through causal inference or induction rather than exploration. In contrast, human children - some of the most proficient explorers - have been shown to use causal information to great benefit. In this work, we introduce a novel RL environment designed with a controllable causal structure, which allows us to evaluate exploration strategies used by both agents and children in a unified environment. In addition, through experimentation on both computational models and children, we demonstrate that there are significant differences between information-gain optimal RL exploration in causal environments and the exploration of children in the same environments. We conclude with a discussion of how these findings may inspire new directions of research into efficient exploration and disambiguation of causal structures for RL algorithms.

GANmouflage: 3D Object Nondetection with Texture Fields

Jan 18, 2022

Rui Guo, Jasmine Collins, Oscar de Lima, Andrew Owens

Abstract: We propose a method that learns to camouflage 3D objects within scenes. Given an object's shape and a distribution of viewpoints from which it will be seen, we estimate a texture that will make it difficult to detect. Successfully solving this task requires a model that can accurately reproduce textures from the scene, while simultaneously dealing with the highly conflicting constraints imposed by each viewpoint. We address these challenges with a model based on texture fields and adversarial learning. Our model learns to camouflage a variety of object shapes from randomly sampled locations and viewpoints within the input scene, and is the first to address the problem of hiding complex object shapes. Using a human visual search study, we find that our estimated textures conceal objects significantly better than previous methods. Project site: https://rrrrrguo.github.io/ganmouflage/

ABO: Dataset and Benchmarks for Real-World 3D Object Understanding

Oct 12, 2021

Jasmine Collins, Shubham Goel, Achleshwar Luthra, Leon Xu, Kenan Deng, Xi Zhang, Tomas F. Yago Vicente, Himanshu Arora, Thomas Dideriksen, Matthieu Guillaumin (+1 more)

Abstract: We introduce Amazon-Berkeley Objects (ABO), a new large-scale dataset of product images and 3D models corresponding to real household objects. We use this realistic, object-centric 3D dataset to measure the domain gap for single-view 3D reconstruction networks trained on synthetic objects. We also use multi-view images from ABO to measure the robustness of state-of-the-art metric learning approaches to different camera viewpoints. Finally, leveraging the physically-based rendering materials in ABO, we perform single- and multi-view material estimation for a variety of complex, real-world geometries. The full dataset is available for download at https://amazon-berkeley-objects.s3.amazonaws.com/index.html.

Exploring Exploration: Comparing Children with RL Agents in Unified Environments

May 06, 2020

Eliza Kosoy, Jasmine Collins, David M. Chan, Jessica B. Hamrick, Sandy Huang, Alison Gopnik, John Canny

Abstract: Research in developmental psychology consistently shows that children explore the world thoroughly and efficiently and that this exploration allows them to learn. In turn, this early learning supports more robust generalization and intelligent behavior later in life. While much work has gone into developing methods for exploration in machine learning, artificial agents have not yet reached the high standard set by their human counterparts. In this work we propose using DeepMind Lab (Beattie et al., 2016) as a platform to directly compare child and agent behaviors and to develop new exploration techniques. We outline two ongoing experiments to demonstrate the effectiveness of a direct comparison, and outline a number of open research questions that we believe can be tested using this methodology.

* Published as a workshop paper at "Bridging AI and Cognitive Science" (ICLR 2020)

Accelerating Training of Deep Neural Networks with a Standardization Loss

Mar 03, 2019

Jasmine Collins, Johannes Balle, Jonathon Shlens

Abstract: A significant advance in accelerating neural network training has been the development of normalization methods, permitting the training of deep models both faster and with better accuracy. These advances come with practical challenges: for instance, batch normalization ties the prediction of individual examples with other examples within a batch, resulting in a network that is heavily dependent on batch size. Layer normalization and group normalization are data-dependent and thus must be continually used, even at test-time. To address the issues that arise from using explicit normalization techniques, we propose to replace existing normalization methods with a simple, secondary objective loss that we term a standardization loss. This formulation is flexible and robust across different batch sizes and, surprisingly, this secondary objective accelerates learning on the primary training objective. Because it is a training loss, it is simply removed at test-time, and no further effort is needed to maintain normalized activations. We find that a standardization loss accelerates training on both small- and large-scale image classification experiments, works with a variety of architectures, and is largely robust to training across different batch sizes.

* Technical report. Results presented at WiML 2018
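
The abstract above does not give the formula for the standardization loss, so the following is only a sketch of one plausible form: for each unit, the KL divergence between a Gaussian with the batch mean and variance of that unit's activations and a standard normal, averaged over units. The function names, the `weight` value, and the `eps` constant are assumptions made for illustration.

```python
import math


def standardization_loss(activations, eps=1e-5):
    """Auxiliary penalty that is near zero when each unit has batch mean 0 and variance 1.

    activations: list of rows, one row of unit activations per example in the batch.
    Assumed form (not taken from the abstract): KL(N(mu, var) || N(0, 1)) per unit.
    """
    n_examples = len(activations)
    n_units = len(activations[0])
    total = 0.0
    for j in range(n_units):
        column = [row[j] for row in activations]
        mu = sum(column) / n_examples
        var = sum((x - mu) ** 2 for x in column) / n_examples + eps
        total += 0.5 * (mu ** 2 + var - math.log(var) - 1.0)
    return total / n_units


def total_loss(primary_loss, activations, weight=0.1, training=True):
    # The auxiliary term is only added during training; at test time it is dropped,
    # so no normalization statistics need to be maintained.
    if not training:
        return primary_loss
    return primary_loss + weight * standardization_loss(activations)


standardized_batch = [[-1.0], [1.0]]   # mean 0, variance 1
shifted_batch = [[4.0], [6.0]]         # mean 5, variance 1
loss_standardized = standardization_loss(standardized_batch)
loss_shifted = standardization_loss(shifted_batch)
```

The shifted batch is penalized heavily while the standardized one is not, which is the pressure that stands in for an explicit normalization layer in this sketch.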

Capacity and Trainability in Recurrent Neural Networks

Mar 03, 2017

Jasmine Collins, Jascha Sohl-Dickstein, David Sussillo

Abstract: Two potential bottlenecks on the expressiveness of recurrent neural networks (RNNs) are their ability to store information about the task in their parameters, and to store information about the input history in their units. We show experimentally that all common RNN architectures achieve nearly the same per-task and per-unit capacity bounds with careful training, for a variety of tasks and stacking depths. They can store an amount of task information which is linear in the number of parameters, and is approximately 5 bits per parameter. They can additionally store approximately one real number from their input history per hidden unit. We further find that for several tasks it is the per-task parameter capacity bound that determines performance. These results suggest that many previous results comparing RNN architectures are driven primarily by differences in training effectiveness, rather than differences in capacity. Supporting this observation, we compare training difficulty for several architectures, and show that vanilla RNNs are far more difficult to train, yet have slightly higher capacity. Finally, we propose two novel RNN architectures, one of which is easier to train than the LSTM or GRU for deeply stacked architectures.

* Published as a conference paper at ICLR 2017
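
The two capacity figures quoted in the abstract above (roughly 5 bits of task information per parameter, and roughly one real number of input history per hidden unit) can be turned into a back-of-the-envelope estimate. The sketch below assumes a single-layer vanilla RNN with a bias vector and no readout layer; that parameter layout and the function names are illustrative assumptions, not the paper's accounting.

```python
BITS_PER_PARAMETER = 5  # approximate figure quoted in the abstract


def vanilla_rnn_parameter_count(n_inputs, n_hidden):
    # Input weights + recurrent weights + bias (assumed layout, readout excluded).
    return n_hidden * n_inputs + n_hidden * n_hidden + n_hidden


def estimated_task_capacity_bits(n_parameters):
    # Task capacity is described as linear in the number of parameters.
    return BITS_PER_PARAMETER * n_parameters


def estimated_input_memory(n_hidden):
    # Roughly one real number from the input history per hidden unit.
    return n_hidden


n_params = vanilla_rnn_parameter_count(n_inputs=64, n_hidden=256)
capacity_bits = estimated_task_capacity_bits(n_params)
memory_reals = estimated_input_memory(256)
```

With these assumed sizes the network has 82,176 parameters, which corresponds to about 410,880 bits of task information and about 256 remembered input values.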

Protein Secondary Structure Prediction Using Deep Multi-scale Convolutional Neural Networks and Next-Step Conditioning

Nov 04, 2016

Akosua Busia, Jasmine Collins, Navdeep Jaitly

Abstract: Recently developed deep learning techniques have significantly improved the accuracy of various speech and image recognition systems. In this paper we adapt some of these techniques for protein secondary structure prediction. We first train a series of deep neural networks to predict eight-class secondary structure labels given a protein's amino acid sequence information and find that using recent methods for regularization, such as dropout and weight-norm constraining, leads to measurable gains in accuracy. We then adapt recent convolutional neural network architectures--Inception, ResNet, and DenseNet with Batch Normalization--to the problem of protein structure prediction. These convolutional architectures make heavy use of multi-scale filter layers that simultaneously compute features on several scales, and use residual connections to prevent underfitting. Using a carefully modified version of these architectures, we achieve state-of-the-art performance of 70.0% per amino acid accuracy on the public CB513 benchmark dataset. Finally, we explore additions from sequence-to-sequence learning, altering the model to make its predictions conditioned on both the protein's amino acid sequence and its past secondary structure labels. We introduce a new method of ensembling such a conditional model with our convolutional model, an approach which reaches 70.6% Q8 accuracy on CB513. We argue that these results can be further refined for larger boosts in prediction accuracy through more sophisticated attempts to control overfitting of conditional models. We aim to release the code for these experiments as part of the TensorFlow repository.

* 10 pages, 2 figures, submitted to RECOMB 2017
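
The per-amino-acid (Q8) accuracy reported in the abstract above is the fraction of residues whose predicted eight-class label matches the reference label. The sketch below shows that computation on made-up label strings; the function name and the toy sequences are illustrative and are not data or code from the paper.

```python
def q8_accuracy(predicted, actual):
    """Fraction of residues whose predicted secondary structure label is correct.

    predicted, actual: lists of equal-length label strings, one string per protein,
    with one character per residue (for example the eight DSSP state codes).
    """
    total = 0
    correct = 0
    for pred_seq, true_seq in zip(predicted, actual):
        if len(pred_seq) != len(true_seq):
            raise ValueError("predicted and actual sequences differ in length")
        for pred_label, true_label in zip(pred_seq, true_seq):
            total += 1
            correct += pred_label == true_label
    if total == 0:
        raise ValueError("no residues to score")
    return correct / total


# Toy example: 10 residues across two proteins, 8 predicted correctly.
toy_predicted = ["HHHEEL", "TTSL"]
toy_actual = ["HHHEEE", "TTSS"]
toy_score = q8_accuracy(toy_predicted, toy_actual)
```

Pooling residues across proteins, as done here, weights long proteins more than short ones; averaging per-protein scores would be a different metric.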
