Abstract: We propose a formal definition for the task of suggestion mining in the context of a wide range of open-domain applications. Human perception of the term \emph{suggestion} is subjective, and this affects the preparation of hand-labeled datasets for suggestion mining. Existing work either lacks a formal problem definition and annotation procedure, or provides domain- and application-specific definitions. Moreover, many previously used manually labeled datasets remain proprietary. We first present an annotation study and, based on our observations, propose a formal task definition and annotation procedure for creating benchmark datasets for suggestion mining. With this study, we also provide publicly available labeled datasets for suggestion mining in multiple domains.
Abstract: Mining suggestion-expressing sentences from a given text is a relatively under-investigated sentence classification task, and it therefore lacks hand-labeled benchmark datasets. In this work, we propose and evaluate two approaches for distant supervision in suggestion mining. The distant supervision is obtained through a large silver standard dataset, constructed using text from wikiHow and Wikipedia. Both approaches use an LSTM-based neural network architecture to learn a classification model for suggestion mining, but differ in how they use the silver standard dataset. The first approach trains the classifier directly on this dataset, while the second approach only learns word embeddings from it. In the second approach, we also learn POS embeddings, which, interestingly, gives the best classification accuracy.
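
As a rough illustration of how a silver standard dataset of this kind might be assembled, the Python sketch below assumes that every wikiHow sentence is labeled as a suggestion and every Wikipedia sentence as a non-suggestion. That labeling rule, the function name `build_silver_dataset`, and the toy sentences are assumptions made for illustration; the abstract does not specify the construction procedure.

```python
import random
from typing import Iterable, List, Tuple


def build_silver_dataset(
    wikihow_sentences: Iterable[str],
    wikipedia_sentences: Iterable[str],
    seed: int = 0,
) -> List[Tuple[str, int]]:
    """Return shuffled (sentence, label) pairs.

    Assumption: wikiHow sentences get label 1 (suggestion) and
    Wikipedia sentences get label 0 (non-suggestion).
    """
    # Drop empty lines and strip surrounding whitespace.
    labeled = [(s.strip(), 1) for s in wikihow_sentences if s.strip()]
    labeled += [(s.strip(), 0) for s in wikipedia_sentences if s.strip()]
    # Shuffle with a fixed seed so the ordering is reproducible.
    random.Random(seed).shuffle(labeled)
    return labeled


# Toy sentences, invented for this example.
silver = build_silver_dataset(
    ["Preheat the oven before mixing the batter.", "Try restarting the router first."],
    ["An oven is a chamber used for heating."],
)
```

The resulting pairs could either train a classifier directly or serve only as a corpus for learning embeddings, which corresponds to the two uses of the silver standard dataset that the abstract contrasts.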