Abstract: Image resolution has a significant effect on the accuracy and on the computational, storage, and bandwidth costs of computer vision model inference. These costs are exacerbated when scaling models out to large inference serving systems, making image resolution an attractive target for optimization. However, the choice of resolution inherently introduces additional, tightly coupled choices, such as image crop size, image detail, and compute kernel implementation, that also affect these costs. Further complicating this setting, the optimal choices with respect to these metrics are highly dependent on the dataset and problem scenario. We characterize this tradeoff space, quantitatively studying the accuracy/efficiency tradeoff via systematic and automated tuning of image resolution, image quality, and convolutional neural network operators. With the insights from this study, we propose a dynamic resolution mechanism that removes the need to statically commit to a resolution ahead of time.
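A minimal sketch of the kind of resolution sweep such tuning automates, profiling inference latency across candidate input resolutions with an off-the-shelf CNN. The model choice, batch size, and resolution grid are illustrative assumptions, not the paper's setup; measuring accuracy at each resolution would additionally require a labeled dataset.

```python
# Sketch: profile inference latency across candidate input resolutions.
# Model choice, batch size, and resolution grid are illustrative assumptions.
import time
import torch
import torchvision.models as models

model = models.resnet50(weights=None).eval()  # untrained stand-in model

def latency_at(res, batch=8, repeats=3):
    """Average wall-clock time of a forward pass at a given resolution."""
    x = torch.randn(batch, 3, res, res)
    with torch.no_grad():
        model(x)  # warm-up pass
        start = time.perf_counter()
        for _ in range(repeats):
            model(x)
    return (time.perf_counter() - start) / repeats

for res in (128, 160, 224, 288, 384):  # candidate resolutions to tune over
    print(f"resolution {res:3d}: {latency_at(res) * 1000:7.1f} ms / batch")
```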
Abstract: Data augmentations are an important ingredient in the recipe for training robust neural networks, especially in computer vision. A fundamental question is whether neural network features explicitly encode data augmentation transformations. To answer this question, we introduce a systematic approach to investigate which layers of neural networks are the most predictive of augmentation transformations. Our approach uses layer features of pre-trained vision models, with minimal additional processing, to predict common properties transformed by augmentation (scale, aspect ratio, hue, saturation, contrast, brightness). Surprisingly, neural network features not only encode data augmentation transformations, they predict many of these transformations with high accuracy. After validating that neural networks encode features corresponding to augmentation transformations, we show that these features are concentrated primarily in the early layers of modern CNNs.
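As an illustration of this probing methodology, the sketch below applies a random brightness factor to images, records a pooled early-layer activation from a frozen CNN, and fits a linear probe to recover the factor. The model, layer, and transform range are assumptions for illustration, not the paper's exact protocol.

```python
# Sketch: linear probe recovering a brightness factor from early-layer
# features. Model, layer, and transform range are illustrative assumptions;
# swap in pretrained weights to match the pre-trained setting in the text.
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms.functional as TF

model = models.resnet18(weights=None).eval()
feats = {}
model.layer1.register_forward_hook(
    lambda m, i, o: feats.update(out=o.mean(dim=(2, 3))))  # pooled activations

xs, ys = [], []
for _ in range(256):
    img = torch.rand(1, 3, 224, 224)
    factor = float(torch.empty(1).uniform_(0.5, 1.5))  # augmentation parameter
    with torch.no_grad():
        model(TF.adjust_brightness(img, factor))
    xs.append(feats["out"].squeeze(0))
    ys.append(factor)

X, y = torch.stack(xs), torch.tensor(ys)
probe = nn.Linear(X.shape[1], 1)  # linear probe on frozen features
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(probe(X).squeeze(1), y)
    loss.backward()
    opt.step()
print(f"probe training MSE on brightness factor: {loss.item():.4f}")
```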
Abstract: We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high-dimensional convolution, are key enablers of effective deep learning systems. However, existing systems rely on manually optimized libraries, such as cuDNN, in which only a narrow range of server-class GPUs is well supported. This reliance on hardware-specific operator libraries limits the applicability of high-level graph optimizations and incurs significant engineering costs when deploying to new hardware targets. We use learning to remove this engineering burden. We learn domain-specific statistical cost models to guide the search for tensor operator implementations over billions of possible program variants. We further accelerate the search with effective model transfer across workloads. Experimental results show that our framework delivers performance competitive with state-of-the-art hand-tuned libraries for low-power CPUs, mobile GPUs, and server-class GPUs.
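The sketch below illustrates the general pattern of cost-model-guided search over a configuration space: seed with random measurements, fit a statistical cost model, and measure the variant the model ranks best. The gradient-boosting model and the simulated "hardware" cost are stand-ins for illustration, not the framework's actual components.

```python
# Sketch: cost-model-guided search over program variants. The gradient
# boosting model and the simulated measurement are illustrative stand-ins.
import random
from sklearn.ensemble import GradientBoostingRegressor

def featurize(cfg):
    return [cfg["tile_x"], cfg["tile_y"], cfg["unroll"]]

def measure(cfg):
    """Stand-in for compiling and timing the variant on real hardware."""
    return (cfg["tile_x"] - 16) ** 2 + (cfg["tile_y"] - 8) ** 2 + random.random()

space = [{"tile_x": tx, "tile_y": ty, "unroll": u}
         for tx in (4, 8, 16, 32) for ty in (2, 4, 8, 16) for u in (0, 1)]

history = [(c, measure(c)) for c in random.sample(space, 8)]  # seed trials
for _ in range(8):  # search iterations: fit the cost model, measure its top pick
    model = GradientBoostingRegressor().fit(
        [featurize(c) for c, _ in history], [t for _, t in history])
    measured = [c for c, _ in history]
    candidates = [c for c in space if c not in measured]
    best = min(candidates, key=lambda c: model.predict([featurize(c)])[0])
    history.append((best, measure(best)))

print("best config found:", min(history, key=lambda h: h[1])[0])
```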
Abstract: There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms, such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs), requires significant manual effort. We propose TVM, a compiler that exposes graph-level and operator-level optimizations to provide performance portability for deep learning workloads across diverse hardware back-ends. TVM solves optimization challenges specific to deep learning, such as high-level operator fusion, mapping to arbitrary hardware primitives, and memory latency hiding. It also automates the optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost modeling method for rapid exploration of code optimizations. Experimental results show that TVM delivers performance across hardware back-ends that is competitive with state-of-the-art hand-tuned libraries for low-power CPUs, mobile GPUs, and server-class GPUs. We also demonstrate TVM's ability to target new accelerator back-ends, such as an FPGA-based generic deep learning accelerator. The system is open source and in production use inside several major companies.
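For concreteness, here is the canonical vector-add example from TVM's introductory tensor expression tutorial: declare an element-wise operator, create a schedule, and compile it for a CPU back-end. Module and method names follow the classic te API and may differ across TVM versions.

```python
# Sketch: declare and compile a vector add with TVM's tensor expression API.
# This follows TVM's introductory tutorial; names may differ across versions.
import numpy as np
import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")
C = te.compute(A.shape, lambda i: A[i] + B[i], name="C")  # element-wise add

s = te.create_schedule(C.op)            # default schedule, tunable per target
fadd = tvm.build(s, [A, B, C], target="llvm")  # compile for the CPU back-end

dev = tvm.cpu(0)
a = tvm.nd.array(np.random.rand(1024).astype("float32"), dev)
b = tvm.nd.array(np.random.rand(1024).astype("float32"), dev)
c = tvm.nd.array(np.zeros(1024, dtype="float32"), dev)
fadd(a, b, c)  # run the compiled kernel
np.testing.assert_allclose(c.numpy(), a.numpy() + b.numpy())
```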