Han Suk Shim

QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer

Jun 23, 2022

Jinmiao Huang, Waseem Gharbieh, Qianhui Wan, Han Suk Shim, Chul Lee

Abstract: Current keyword spotting systems are typically trained with a large amount of pre-defined keywords. Recognizing keywords in an open-vocabulary setting is essential for personalizing smart device interaction. Towards this goal, we propose a pure MLP-based neural network that is based on MLPMixer, an MLP model architecture that effectively replaces the attention mechanism in Vision Transformers. We investigate different ways of adapting the MLPMixer architecture to the query-by-example (QbyE) open-vocabulary keyword spotting task. Comparisons with the state-of-the-art RNN and CNN models show that our method achieves better performance in challenging situations (10 dB and 6 dB environments) on both the publicly available Hey-Snips dataset and a larger-scale internal dataset with 400 speakers. Our proposed model also has fewer parameters and multiply-accumulate operations (MACs) than the baseline models.

* Accepted to INTERSPEECH 2022
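
To make the MLPMixer idea in the abstract concrete, here is a minimal NumPy sketch of a single generic mixer block applied to a sequence of acoustic feature frames. It is not the authors' QbyE-MLPMixer: the input shape (98 frames of 40 features), the hidden sizes, the parameter names, and the layer norm without learnable scale and bias are all assumptions chosen for illustration.

```python
import numpy as np

def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layer_norm(x, eps=1e-6):
    # Simplified layer norm over the last axis (no learnable scale or bias).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron acting on the last axis of x.
    return gelu(x @ w1 + b1) @ w2 + b2

def mixer_block(x, p):
    # x has shape (frames, channels).
    # Token mixing: transpose so the MLP mixes information across frames.
    y = layer_norm(x).T                      # (channels, frames)
    y = mlp(y, p["tw1"], p["tb1"], p["tw2"], p["tb2"]).T
    x = x + y                                # residual connection
    # Channel mixing: the MLP mixes information across features per frame.
    y = mlp(layer_norm(x), p["cw1"], p["cb1"], p["cw2"], p["cb2"])
    return x + y                             # residual connection

def init_params(frames, channels, token_hidden, channel_hidden, rng, scale=0.02):
    # Hypothetical initialisation; the sizes are illustrative only.
    return {
        "tw1": rng.normal(0.0, scale, (frames, token_hidden)),
        "tb1": np.zeros(token_hidden),
        "tw2": rng.normal(0.0, scale, (token_hidden, frames)),
        "tb2": np.zeros(frames),
        "cw1": rng.normal(0.0, scale, (channels, channel_hidden)),
        "cb1": np.zeros(channel_hidden),
        "cw2": rng.normal(0.0, scale, (channel_hidden, channels)),
        "cb2": np.zeros(channels),
    }

rng = np.random.default_rng(0)
x = rng.normal(size=(98, 40))                # assumed: 98 frames, 40 features
params = init_params(98, 40, 64, 128, rng)
out = mixer_block(x, params)
print(out.shape)
```

The block contains no attention: mixing across time is done by an ordinary MLP applied to the transposed input, which is the substitution the abstract refers to.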

Query-by-Example Keyword Spotting system using Multi-head Attention and Softtriple Loss

Feb 14, 2021

Jinmiao Huang, Waseem Gharbieh, Han Suk Shim, Eugene Kim

Abstract: This paper proposes a neural network architecture for tackling the query-by-example user-defined keyword spotting task. A multi-head attention module is added on top of a multi-layered GRU for effective feature extraction, and a normalized multi-head attention module is proposed for feature aggregation. We also adopt the softtriple loss, a combination of triplet loss and softmax loss, and showcase its effectiveness. We demonstrate the performance of our model on internal datasets with different languages and the public Hey-Snips dataset. We compare the performance of our model to a baseline system and conduct an ablation study to show the benefit of each component in our architecture. The proposed work shows solid performance while preserving simplicity.

* Published at ICASSP 2021
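
The abstract does not describe how a query is matched against an enrolled keyword, so the following is only a sketch of one common query-by-example setup, not the paper's method. It assumes an encoder has already produced fixed-length embeddings, averages a few enrollment embeddings into a template, and accepts a query when its cosine similarity to the template reaches a threshold. The function names and the threshold of 0.8 are hypothetical.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def enroll(embeddings):
    # Average the enrollment embeddings element-wise into one template.
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]

def detect(template, query_embedding, threshold=0.8):
    # Accept the query when it is close enough to the enrolled template.
    return cosine_similarity(template, query_embedding) >= threshold

# Toy 2-dimensional embeddings standing in for encoder outputs.
template = enroll([[1.0, 0.0], [1.0, 0.2]])
print(detect(template, [0.9, 0.1]))   # query close to the template
print(detect(template, [0.0, 1.0]))   # query far from the template
```

In practice the threshold would be tuned on held-out data to trade false accepts against false rejects.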

Robust Uncalibrated Stereo Rectification with Constrained Geometric Distortions (USR-CGD)

Mar 31, 2016

Hyunsuk Ko, Han Suk Shim, Ouk Choi, C. -C. Jay Kuo

Abstract: A novel algorithm for uncalibrated stereo image-pair rectification under the constraint of geometric distortion, called USR-CGD, is presented in this work. Although it is straightforward to define a rectifying transformation (or homography) given the epipolar geometry, many existing algorithms have unwanted geometric distortions as a side effect. To obtain rectified images with reduced geometric distortions while maintaining a small rectification error, we parameterize the homography by considering the influence of various kinds of geometric distortions. Next, we define several geometric measures and incorporate them into a new cost function for parameter optimization. Finally, we propose a constrained adaptive optimization scheme to allow a balanced performance between the rectification error and the geometric error. Extensive experimental results are provided to demonstrate the superb performance of the proposed USR-CGD method, which outperforms existing algorithms by a significant margin.
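
The abstract does not define its rectification error, so the sketch below uses one widely used proxy as an assumption: after both images are warped by their rectifying homographies, matched points should lie on the same row, so the mean absolute difference of their y-coordinates measures the residual misalignment. The function names and the toy points are hypothetical, and this is not the USR-CGD cost function.

```python
import numpy as np

def apply_homography(H, pts):
    # Map (N, 2) pixel coordinates through a 3x3 homography.
    pts = np.asarray(pts, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]

def rectification_error(H_left, H_right, pts_left, pts_right):
    # Mean absolute vertical offset between matched points after warping.
    y_left = apply_homography(H_left, pts_left)[:, 1]
    y_right = apply_homography(H_right, pts_right)[:, 1]
    return float(np.mean(np.abs(y_left - y_right)))

# Toy matches: the right image is shifted down by 2 pixels.
pts_left = np.array([[10.0, 20.0], [50.0, 80.0], [120.0, 40.0]])
pts_right = pts_left + np.array([-5.0, 2.0])

identity = np.eye(3)
shift_up = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, -2.0],
                     [0.0, 0.0, 1.0]])

error_before = rectification_error(identity, identity, pts_left, pts_right)
error_after = rectification_error(identity, shift_up, pts_left, pts_right)
print(error_before, error_after)
```

A measure like this captures only row alignment; the geometric distortion terms the abstract mentions would have to be evaluated separately.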