Parker Hill

An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction

Sep 04, 2019

Stefan Larson, Anish Mahendran, Joseph J. Peper, Christopher Clarke, Andrew Lee, Parker Hill, Jonathan K. Kummerfeld, Kevin Leach, Michael A. Laurenzano, Lingjia Tang(+1 more)

Abstract: Task-oriented dialog systems need to know when a query falls outside their range of supported intents, but current text classification corpora only define label sets that cover every example. We introduce a new dataset that includes queries that are out-of-scope, i.e., queries that do not fall into any of the system's supported intents. This poses a new challenge because models cannot assume that every query at inference time belongs to a system-supported intent class. Our dataset also covers 150 intent classes over 10 domains, capturing the breadth that a production task-oriented agent must handle. We evaluate a range of benchmark classifiers on our dataset along with several different out-of-scope identification schemes. We find that while the classifiers perform well on in-scope intent classification, they struggle to identify out-of-scope queries. Our dataset and evaluation fill an important gap in the field, offering a way of more rigorously and realistically benchmarking text classification in task-driven dialog systems.

* Accepted to EMNLP-IJCNLP 2019
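The abstract above does not say which out-of-scope identification schemes were evaluated. As a rough illustration of the general idea only, one commonly used scheme is to reject a query when the classifier's top confidence falls below a threshold. The function name, the label `out_of_scope`, the intent names, and the threshold value in this sketch are all assumptions, not details from the paper.

```python
def predict_with_oos(probs, threshold=0.7):
    """Return the top intent, or "out_of_scope" if the classifier is not confident.

    probs: dict mapping intent name -> probability from some in-scope classifier.
    threshold: hypothetical cutoff; in practice it would be tuned on held-out data.
    """
    intent, confidence = max(probs.items(), key=lambda kv: kv[1])
    return intent if confidence >= threshold else "out_of_scope"


# Toy probabilities (made up for illustration, not results from the dataset).
confident = {"book_flight": 0.9, "weather": 0.1}
uncertain = {"book_flight": 0.4, "weather": 0.35, "timer": 0.25}

print(predict_with_oos(confident))
print(predict_with_oos(uncertain))
```

A scheme like this relies on the classifier being less confident on unsupported queries, which is exactly the assumption the abstract reports classifiers struggle with.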

Outlier Detection for Improved Data Quality and Diversity in Dialog Systems

Apr 05, 2019

Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars

Abstract: In a corpus of data, outliers are either errors (mistakes in the data that are counterproductive) or unique samples (informative samples that improve model robustness). Identifying outliers can lead to better datasets by (1) removing noise in datasets and (2) guiding collection of additional data to fill gaps. However, the problem of detecting both outlier types has received relatively little attention in NLP, particularly for dialog systems. We introduce a simple and effective technique for detecting both erroneous and unique samples in a corpus of short texts using neural sentence embeddings combined with distance-based outlier detection. We also present a novel data collection pipeline built atop our detection technique to automatically and iteratively mine unique data samples while discarding erroneous samples. Experiments show that our outlier detection technique is effective at finding errors while our data collection pipeline yields highly diverse corpora that in turn produce more robust intent classification and slot-filling models.

* Accepted as long paper to NAACL 2019
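The abstract above describes combining sentence embeddings with distance-based outlier detection but gives no further detail. A minimal sketch of one way such a ranking could work is shown here, assuming each sample has already been embedded as a fixed-length vector and that distance from the mean vector is the outlier score. The toy 2-D vectors and the function name are hypothetical stand-ins, not the authors' embeddings or code.

```python
import math


def rank_by_distance_from_mean(vectors):
    """Return sample indices ordered from most to least distant from the mean vector."""
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    distances = [math.dist(v, mean) for v in vectors]  # Euclidean distance
    return sorted(range(len(vectors)), key=lambda i: distances[i], reverse=True)


# Toy "embeddings": three similar samples and one far-away sample.
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
ranking = rank_by_distance_from_mean(embeddings)
print(ranking[0])  # index of the most outlying sample
```

The top-ranked samples would then be reviewed, since the abstract notes that an outlier may be either an error to discard or a unique sample worth keeping.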

Rethinking Numerical Representations for Deep Neural Networks

Aug 07, 2018

Parker Hill, Babak Zamirai, Shengshuo Lu, Yu-Wei Chao, Michael Laurenzano, Mehrzad Samadi, Marios Papaefthymiou, Scott Mahlke, Thomas Wenisch, Jia Deng(+2 more)

Abstract: With ever-increasing computational demand for deep learning, it is critical to investigate the implications of the numeric representation and precision of DNN model weights and activations on computational efficiency. In this work, we explore unconventional narrow-precision floating-point representations as they relate to inference accuracy and efficiency to steer the improved design of future DNN platforms. We show that inference using these custom numeric representations on production-grade DNNs, including GoogLeNet and VGG, achieves an average speedup of 7.6x with less than 1% degradation in inference accuracy relative to a state-of-the-art baseline platform representing the most sophisticated hardware using single-precision floating point. To facilitate the use of such customized precision, we also present a novel technique that drastically reduces the time required to derive the optimal precision configuration.
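To give a feel for what a narrow-precision floating-point representation does to a value, the sketch below rounds a number to a format with a chosen count of mantissa and exponent bits. The abstract does not specify the formats studied, so every rule here is an assumption made for illustration: an IEEE-style exponent bias, no subnormals (small values flush to zero), and saturation at the largest finite value. The function name is hypothetical.

```python
import math


def quantize(x, mantissa_bits, exponent_bits):
    """Round x to the nearest value in an assumed custom floating-point format."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))      # abs(x) == m * 2**e with 0.5 <= m < 1
    m, e = m * 2.0, e - 1          # rewrite as 1.f * 2**e with 1 <= m < 2
    scale = 2 ** mantissa_bits
    m = round(m * scale) / scale   # keep only mantissa_bits fractional bits
    if m >= 2.0:                   # rounding carried into the next exponent
        m, e = m / 2.0, e + 1
    bias = 2 ** (exponent_bits - 1) - 1
    e_min, e_max = 1 - bias, bias  # assumed IEEE-like exponent range
    if e > e_max:                  # too large: saturate (assumed behavior)
        return sign * (2.0 - 1.0 / scale) * 2.0 ** e_max
    if e < e_min:                  # too small: flush to zero (assumed behavior)
        return 0.0
    return sign * m * 2.0 ** e


print(quantize(3.14159, 3, 4))   # 3 mantissa bits, 4 exponent bits
print(quantize(3.14159, 10, 5))  # a wider format keeps more precision
```

Applying such a function to weights and activations in software is one way to estimate accuracy loss before committing to a hardware format; the speedup reported in the abstract comes from the authors' hardware evaluation, which this sketch does not model.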
