Nick Weston

SMASH: One-Shot Model Architecture Search through HyperNetworks

Aug 17, 2017

Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

Abstract: Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks. Our code is available at https://github.com/ajbrock/SMASH

FreezeOut: Accelerate Training by Progressively Freezing Layers

Jun 18, 2017

Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

Abstract: The early layers of a deep neural net have the fewest parameters, but take up the most computation. In this extended abstract, we propose to only train the hidden layers for a set portion of the training run, freezing them out one-by-one and excluding them from the backward pass. Through experiments on CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20% wall-clock time during training with 3% loss in accuracy for DenseNets, a 20% speedup without loss of accuracy for ResNets, and no improvement for VGG networks. Our code is publicly available at https://github.com/ajbrock/FreezeOut

* Extended Abstract
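
The freezing idea in the abstract above can be illustrated with a minimal sketch. It assumes a linear schedule in which the first layer is frozen after a fraction `t_first` of training and the last layer trains until the end; the abstract does not specify the schedule, and the names `freeze_fractions`, `trainable_layers`, and `t_first` are hypothetical, not taken from the authors' code.

```python
def freeze_fractions(num_layers, t_first=0.5):
    """Return, per layer, the fraction of training after which it is frozen.

    Assumption for illustration: fractions are spaced linearly from
    t_first (earliest layer) to 1.0 (last layer, never frozen early).
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if not 0.0 < t_first <= 1.0:
        raise ValueError("t_first must be in (0, 1]")
    if num_layers == 1:
        return [1.0]
    step = (1.0 - t_first) / (num_layers - 1)
    return [t_first + i * step for i in range(num_layers)]


def trainable_layers(fractions, iteration, total_iterations):
    """Return indices of layers that still receive updates at this iteration."""
    progress = iteration / total_iterations
    return [i for i, t in enumerate(fractions) if progress < t]


if __name__ == "__main__":
    fractions = freeze_fractions(5, t_first=0.5)
    for it in (0, 50, 99):
        print(it, trainable_layers(fractions, it, 100))
```

Because the frozen layers always form a prefix of the network, the backward pass can stop at the first layer that is still trainable, which is where the wall-clock savings would come from under this sketch.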

Neural Photo Editing with Introspective Adversarial Networks

Feb 06, 2017

Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

Abstract: The increasingly photorealistic sample quality of generative image models suggests their feasibility in applications beyond image generation. We present the Neural Photo Editor, an interface that leverages the power of generative neural networks to make large, semantically coherent changes to existing images. To tackle the challenge of achieving accurate reconstructions without loss of feature quality, we introduce the Introspective Adversarial Network, a novel hybridization of the VAE and GAN. Our model efficiently captures long-range dependencies through use of a computational block based on weight-shared dilated convolutions, and improves generalization performance with Orthogonal Regularization, a novel weight regularization method. We validate our contributions on CelebA, SVHN, and CIFAR-100, and produce samples and reconstructions with high visual fidelity.

* 10 pages, 7 figures, 3 tables
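
As a rough illustration of what a weight regularizer that encourages orthogonality might compute, the sketch below penalizes the element-wise absolute deviation of W·Wᵀ from the identity for a weight matrix given as a list of rows. The abstract above does not state the formula, so this particular form and the name `orthogonal_penalty` are assumptions made for illustration only.

```python
def orthogonal_penalty(weight):
    """Sum of |(W W^T - I)_ij| over all entries, for W given as a list of rows.

    Assumed form for illustration: the penalty is zero exactly when the
    rows of W are orthonormal.
    """
    rows = len(weight)
    penalty = 0.0
    for i in range(rows):
        for j in range(rows):
            # Entry (i, j) of W W^T is the dot product of rows i and j.
            gram = sum(a * b for a, b in zip(weight[i], weight[j]))
            target = 1.0 if i == j else 0.0
            penalty += abs(gram - target)
    return penalty


if __name__ == "__main__":
    print(orthogonal_penalty([[1.0, 0.0], [0.0, 1.0]]))  # orthonormal rows
    print(orthogonal_penalty([[1.0, 1.0], [0.0, 1.0]]))  # non-orthogonal rows
```

In training, a term like this would typically be scaled by a small coefficient and added to the main loss; the coefficient and the choice of which layers to regularize are left open here.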

Generative and Discriminative Voxel Modeling with Convolutional Neural Networks

Aug 16, 2016

Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

Abstract: When working with three-dimensional data, choice of representation is key. We explore voxel-based models, and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencoder, and a deep convolutional neural network architecture for object classification. We address challenges unique to voxel-based representations, and empirically evaluate our models on the ModelNet benchmark, where we demonstrate a 51.5% relative improvement in the state of the art for object classification.

* 9 pages, 5 figures, 2 tables
