Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Matthew Halpern

One Size Does Not Fit All: Quantifying and Exposing the Accuracy-Latency Trade-off in Machine Learning Cloud Service APIs via Tolerance Tiers

Jun 26, 2019

Matthew Halpern, Behzad Boroujerdian, Todd Mummert, Evelyn Duesterwald, Vijay Janapa Reddi

Figure 1 for One Size Does Not Fit All: Quantifying and Exposing the Accuracy-Latency Trade-off in Machine Learning Cloud Service APIs via Tolerance Tiers

Abstract:Today's cloud service architectures follow a "one size fits all" deployment strategy where the same service version instantiation is provided to the end users. However, consumers are broad and different applications have different accuracy and responsiveness requirements, which as we demonstrate renders the "one size fits all" approach inefficient in practice. We use a production-grade speech recognition engine, which serves several thousands of users, and an open source computer vision based system, to explain our point. To overcome the limitations of the "one size fits all" approach, we recommend Tolerance Tiers where each MLaaS tier exposes an accuracy/responsiveness characteristic, and consumers can programmatically select a tier. We evaluate our proposal on the CPU-based automatic speech recognition (ASR) engine and cutting-edge neural networks for image classification deployed on both CPUs and GPUs. The results show that our proposed approach provides an MLaaS cloud service architecture that can be tuned by the end API user or consumer to outperform the conventional "one size fits all" approach.

* 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Via

Access Paper or Ask Questions

Domain Specific Approximation for Object Detection

Oct 04, 2018

Ting-Wu Chin, Chia-Lin Yu, Matthew Halpern, Hasan Genc, Shiao-Li Tsao, Vijay Janapa Reddi

Figure 1 for Domain Specific Approximation for Object Detection

Figure 2 for Domain Specific Approximation for Object Detection

Figure 3 for Domain Specific Approximation for Object Detection

Figure 4 for Domain Specific Approximation for Object Detection

Abstract:There is growing interest in object detection in advanced driver assistance systems and autonomous robots and vehicles. To enable such innovative systems, we need faster object detection. In this work, we investigate the trade-off between accuracy and speed with domain-specific approximations, i.e. category-aware image size scaling and proposals scaling, for two state-of-the-art deep learning-based object detection meta-architectures. We study the effectiveness of applying approximation both statically and dynamically to understand the potential and the applicability of them. By conducting experiments on the ImageNet VID dataset, we show that domain-specific approximation has great potential to improve the speed of the system without deteriorating the accuracy of object detectors, i.e. up to 7.5x speedup for dynamic domain-specific approximation. To this end, we present our insights toward harvesting domain-specific approximation as well as devise a proof-of-concept runtime, AutoFocus, that exploits dynamic domain-specific approximation.

* T. Chin, C. Yu, M. Halpern, H. Genc, S. Tsao and V. J. Reddi, "Domain-Specific Approximation for Object Detection," in IEEE Micro, vol. 38, no. 1, pp. 31-40, January/February 2018
* 6 pages, 6 figures. Published in IEEE Micro, vol. 38, no. 1, pp. 31-40, January/February 2018

Via

Access Paper or Ask Questions