Abstract: Social media has grown remarkably over the past few years, and posting messages on social media websites has become one of the most popular Internet activities. The vast amount of user-generated content has made social media the most extensive data source of public opinion. Sentiment analysis is one of the techniques used to analyze user-generated data. The Persian language has specific features and therefore requires methods and models for sentiment analysis that differ from those used for English. Because sentiment analysis has language-specific prerequisites, methods, tools, and resources developed for English can be applied to Persian only with limitations. The main goal of this paper is to provide a comprehensive survey of state-of-the-art advances in Persian sentiment analysis. To this end, the study investigates and compares previous sentiment analysis studies on Persian texts and describes the contributions of articles published in the last decade. First, the levels, approaches, and tasks of sentiment analysis are described. Then, a detailed survey of the sentiment analysis methods used for Persian texts is presented, and previous relevant work on the Persian language is discussed. The survey also presents the authentic, published standard sentiment analysis resources and the advances made for Persian sentiment analysis. Finally, in light of the state of the art in English sentiment analysis, some issues and challenges not yet addressed for Persian texts are listed, and guidelines and trends are suggested for future research on Persian texts. The paper is intended to help new and established researchers in the field, as well as industry developers who aim to deploy a complete operational sentiment analysis system.
Abstract: With the widespread dissemination of user-generated content on social networks and online consumer systems such as Amazon, the quantity of opinionated information available on the Internet has increased. One of the main tasks of sentiment analysis is to detect polarity within a text. Existing polarity detection methods mainly focus on keywords and their naive frequency counts and pay little attention to the meanings and implicit dimensions of natural concepts. Although background knowledge plays a critical role in determining the polarity of concepts, it has been disregarded in polarity detection methods. This study presents a context-based model that resolves concepts with ambiguous polarity using commonsense knowledge. First, a model is presented that generates a resource of ambiguous sentiment concepts based on SenticNet by computing the probability distribution. The model then uses a bag-of-concepts approach to remove ambiguities, together with semantic augmentation based on ConceptNet to compensate for lost knowledge. ConceptNet is a large-scale semantic network containing a large number of commonsense concepts. The pointwise mutual information (PMI) measure is used to select the contextual concepts that have strong relationships with the ambiguous concepts. The polarity of the ambiguous concepts is then detected precisely using the positive/negative contextual concepts and the relationships among concepts in the semantic knowledge base. The text representation scheme is semantically enriched using Numberbatch, a word embedding model based on the concepts in the ConceptNet semantic network. The proposed model is evaluated on a corpus of product reviews called SemEval. The experimental results show an accuracy of 82.07%, indicating the effectiveness of the proposed model.
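As a rough illustration of the PMI step mentioned in the abstract, the sketch below scores how strongly each contextual concept is associated with an ambiguous concept, using PMI(x, y) = log2(p(x, y) / (p(x) p(y))). It is not the authors' implementation: the function names `pmi_table` and `top_context`, the toy reviews, and the choice to estimate probabilities from review-level co-occurrence counts are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Map each co-occurring concept pair to its PMI (log base 2).

    Assumption: probabilities are estimated as the fraction of documents
    in which a concept (or a pair of concepts) appears.
    """
    n_docs = len(documents)
    concept_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        concepts = set(doc)  # count each concept once per document
        concept_counts.update(concepts)
        pair_counts.update(
            frozenset(pair) for pair in combinations(sorted(concepts), 2)
        )
    table = {}
    for pair, joint in pair_counts.items():
        a, b = tuple(pair)
        p_joint = joint / n_docs
        p_a = concept_counts[a] / n_docs
        p_b = concept_counts[b] / n_docs
        table[pair] = math.log2(p_joint / (p_a * p_b))
    return table


def top_context(ambiguous, documents, k=2):
    """Return the k contextual concepts with the highest PMI to `ambiguous`."""
    table = pmi_table(documents)
    scored = [
        (next(iter(pair - {ambiguous})), score)
        for pair, score in table.items()
        if ambiguous in pair
    ]
    # Highest PMI first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


# Hypothetical toy reviews, each already reduced to a bag of concepts.
reviews = [
    ["small", "room", "cramped"],
    ["small", "price", "cheap"],
    ["small", "room", "noisy"],
    ["large", "price", "expensive"],
]
print(top_context("small", reviews, k=4))
```

In this toy data, "small" co-occurs with "room" more often than chance would predict (positive PMI) and with "price" less often (negative PMI), so "room" would be kept as a contextual concept while "price" would rank below it. How the selected concepts are then combined with SenticNet and ConceptNet polarities is not specified in the abstract and is left out of the sketch.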