Hyeongmin Byun

Towards Real-Time Automatic Portrait Matting on Mobile Devices

Apr 08, 2019

Seokjun Seo, Seungwoo Choi, Martin Kersner, Beomjun Shin, Hyungsuk Yoon, Hyeongmin Byun, Sungjoo Ha

Abstract: We tackle the problem of automatic portrait matting on mobile devices. The proposed model is aimed at attaining real-time inference on mobile devices with minimal degradation of model performance. Our model MMNet, based on multi-branch dilated convolution with linear bottleneck blocks, outperforms the state-of-the-art model and is orders of magnitude faster. The model can be accelerated four times to attain 30 FPS on a Xiaomi Mi 5 device with a moderate increase in the gradient error. Under the same conditions, our model has an order of magnitude fewer parameters and is faster than Mobile DeepLabv3 while maintaining comparable performance. The accompanying implementation can be found at https://github.com/hyperconnect/MMNet.
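As a rough illustration of the "multi-branch dilated convolution" idea named in the abstract, the sketch below applies one small kernel at several dilation rates so that each branch covers a different receptive field. It is a simplified 1D, pure-Python analogue and not the MMNet implementation: the function names `dilated_conv1d` and `multi_branch` are hypothetical, and sharing a single kernel across branches is an assumption made only to keep the example short.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with `dilation` samples between kernel taps."""
    # Distance between the first and last tap of the dilated kernel.
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def multi_branch(signal, kernel, dilations):
    """Run the kernel once per dilation rate (assumed: one shared kernel)."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    ramp = [1, 2, 3, 4, 5, 6, 7]
    branches = multi_branch(ramp, kernel=[1, 0, -1], dilations=[1, 2, 3])
    for rate, out in branches.items():
        print(rate, out)
```

A larger dilation rate widens the receptive field without adding kernel weights, which is why such branches are attractive when the parameter budget is tight.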

Temporal Convolution for Real-time Keyword Spotting on Mobile Devices

Apr 08, 2019

Seungwoo Choi, Seokjun Seo, Beomjun Shin, Hyeongmin Byun, Martin Kersner, Beomsu Kim, Dongyoung Kim, Sungjoo Ha

Abstract: Keyword spotting (KWS) plays a critical role in enabling speech-based user interactions on smart devices. Recent developments in the field of deep learning have led to wide adoption of convolutional neural networks (CNNs) in KWS systems due to their exceptional accuracy and robustness. The main challenge faced by KWS systems is the trade-off between high accuracy and low latency. Unfortunately, there has been little quantitative analysis of the actual latency of KWS models on mobile devices. This is especially concerning since conventional convolution-based KWS approaches are known to require a large number of operations to attain an adequate level of performance. In this paper, we propose a temporal convolution for real-time KWS on mobile devices. Unlike most of the 2D convolution-based KWS approaches that require a deep architecture to fully capture both low- and high-frequency domains, we exploit temporal convolutions with a compact ResNet architecture. On the Google Speech Command Dataset, we achieve more than 385x speedup on Google Pixel 1 and surpass the accuracy of the state-of-the-art model. In addition, we release the implementation of the proposed and the baseline models including an end-to-end pipeline for training models and evaluating them on mobile devices.
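The temporal convolution described in the abstract can be sketched as a 1D convolution that slides only along the time axis. The sketch assumes that each audio frame's frequency features are treated as input channels, so every filter sees the whole frequency range at once. This is a minimal pure-Python illustration and not the released implementation: the function name `temporal_conv1d` and the toy input are hypothetical.

```python
def temporal_conv1d(frames, filters, stride=1):
    """Convolve along time only, with 'valid' padding.

    frames:  list of time steps, each a list of per-frequency features
             (assumed to act as input channels).
    filters: list of filters, each shaped [kernel_size][num_features].
    Returns one output vector (one value per filter) per output time step.
    """
    kernel_size = len(filters[0])
    outputs = []
    for t in range(0, len(frames) - kernel_size + 1, stride):
        step = []
        for filt in filters:
            acc = 0.0
            for dt in range(kernel_size):
                for c, weight in enumerate(filt[dt]):
                    acc += weight * frames[t + dt][c]
            step.append(acc)
        outputs.append(step)
    return outputs


if __name__ == "__main__":
    # Toy input: 5 time steps, 3 frequency features per step.
    frames = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [2, 0, 0]]
    sum_all = [[1, 1, 1]] * 3                          # sums a 3-step window
    first_bin = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]      # reads feature 0 at window start
    print(temporal_conv1d(frames, [sum_all, first_bin]))
```

Each output step here holds one value per filter, not a full time-by-frequency map, so later layers operate on a much smaller feature map than they would after a 2D convolution over the same input.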

* Submitted to INTERSPEECH 2019
