Aftab Anjum

Dimensionality Reduction for Sentiment Classification: Evolving for the Most Prominent and Separable Features

Jun 01, 2020

Aftab Anjum, Mazharul Islam, Lin Wang

Abstract: In sentiment classification, the enormous amount of textual data, its immense dimensionality, and its inherent noise make it extremely difficult for machine learning classifiers to extract high-level and complex abstractions. Dimensionality reduction techniques are needed to make the data less sparse and more statistically significant. In existing dimensionality reduction techniques, however, the number of components must be set manually, which results in the loss of the most prominent features and thus reduces classifier performance. Our prior work, Term Presence Count (TPC) and Term Presence Ratio (TPR), has proven effective because these techniques reject the less separable features. However, the most prominent and separable features might still be removed from the initial feature set despite having higher distributions among positive- and negative-tagged documents. To overcome this problem, we propose a new framework that consists of two dimensionality reduction techniques: Sentiment Term Presence Count (SentiTPC) and Sentiment Term Presence Ratio (SentiTPR). These techniques reject features by considering the term presence difference (SentiTPC) and the ratio of the distribution distinction (SentiTPR). Additionally, both methods analyze the total distribution information. Extensive experimental results show that the proposed framework reduces the feature dimension by a large margin and thus significantly improves classification performance.

* Pages 1-14
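
The abstract does not give the exact SentiTPC and SentiTPR formulas, so the following is only a minimal sketch of the general idea of filtering terms by presence difference and by presence ratio. The function names, the whitespace tokenizer, the thresholds `min_diff` and `min_ratio`, and the ratio definition `|pos - neg| / (pos + neg)` are all assumptions made for illustration, not the authors' definitions.

```python
from collections import Counter


def presence_counts(docs):
    """Count, for each term, how many documents contain it at least once."""
    counts = Counter()
    for doc in docs:
        # set() so that repeated terms in one document count only once
        counts.update(set(doc.lower().split()))
    return counts


def select_features(pos_docs, neg_docs, min_diff=None, min_ratio=None):
    """Keep terms whose presence differs enough between the two classes.

    min_diff  : assumed count-style threshold on |pos - neg|
    min_ratio : assumed ratio-style threshold on |pos - neg| / (pos + neg)
    """
    pos = presence_counts(pos_docs)
    neg = presence_counts(neg_docs)
    kept = []
    for term in sorted(set(pos) | set(neg)):
        p, n = pos[term], neg[term]
        diff = abs(p - n)
        ratio = diff / (p + n)  # p + n >= 1 for every term seen
        if min_diff is not None and diff < min_diff:
            continue
        if min_ratio is not None and ratio < min_ratio:
            continue
        kept.append(term)
    return kept


# Toy corpus, invented purely for illustration.
pos_docs = ["great movie great cast", "great plot", "the movie was fine"]
neg_docs = ["terrible movie", "the plot was terrible", "boring cast"]

print(select_features(pos_docs, neg_docs, min_diff=2))
print(select_features(pos_docs, neg_docs, min_ratio=0.5))
```

On this toy corpus the ratio threshold alone also keeps terms that appear in a single document, which is one reason a method might additionally look at how often a term occurs overall; passing both thresholds together is one simple way to combine the two signals in this sketch.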

A Novel Continuous Representation of Genetic Programmings using Recurrent Neural Networks for Symbolic Regression

Apr 06, 2019

Aftab Anjum, Fengyang Sun, Lin Wang, Jeff Orchard

Abstract: This paper proposes neuro-encoded expression programming, which aims to offer a novel continuous representation of combinatorial encoding for genetic programming methods. Genetic programming with linear representation uses nature-inspired operators to tune expressions and ultimately find the best explicit function to model the data. The encoding mechanism is essential for genetic programming to find a desirable solution efficiently. However, linear representation methods manipulate the expression tree in a discrete solution space, where a small change in the input can cause a large change in the output. The resulting non-smooth landscapes destroy local information and make the search difficult. Neuro-encoded expression programming constructs the gene string with a recurrent neural network (RNN), and the weights of the network are optimized by powerful continuous evolutionary algorithms. The neural network mappings smooth the sharp fitness landscape and provide rich neighborhood information for finding the best expression. Experiments indicate that the novel approach improves test accuracy and efficiency on several well-known symbolic regression problems.

* 12 pages, 4 figures, 2 tables; to be submitted to ICANN 2019, the 28th International Conference on Artificial Neural Networks, held September 17-19, 2019, in Munich, Germany
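
To make the idea of a continuous encoding concrete, here is a minimal sketch of how a flat vector of real-valued weights could be decoded into a discrete gene string by a tiny recurrent network. The abstract does not specify the architecture, the symbol set, or the decoding rule, so everything here (the `SYMBOLS` list, the hidden size, the constant initial state, and the argmax readout) is a hypothetical choice for illustration only.

```python
import math
import random

# Hypothetical symbol set for a symbolic-regression gene string.
SYMBOLS = ["+", "-", "*", "x", "1"]


def decode_gene_string(weights, length=8, hidden=4):
    """Map a flat real-valued weight vector to a list of discrete symbols.

    The vector is split into a recurrent matrix (hidden x hidden) and an
    output matrix (len(SYMBOLS) x hidden). At each step the hidden state is
    updated with tanh and the highest-scoring symbol is emitted.
    """
    n_sym = len(SYMBOLS)
    expected = hidden * hidden + n_sym * hidden
    if len(weights) != expected:
        raise ValueError(f"expected {expected} weights, got {len(weights)}")

    w_h = [weights[i * hidden:(i + 1) * hidden] for i in range(hidden)]
    offset = hidden * hidden
    w_o = [weights[offset + i * hidden:offset + (i + 1) * hidden]
           for i in range(n_sym)]

    state = [1.0] * hidden  # assumed fixed initial hidden state
    genes = []
    for _ in range(length):
        state = [math.tanh(sum(w * s for w, s in zip(row, state)))
                 for row in w_h]
        scores = [sum(w * s for w, s in zip(row, state)) for row in w_o]
        genes.append(SYMBOLS[scores.index(max(scores))])
    return genes


# A continuous optimizer would search over vectors like this one.
rng = random.Random(0)
n_weights = 4 * 4 + len(SYMBOLS) * 4
weights = [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
print(decode_gene_string(weights))
```

In a setup like this, the search operates on `weights` rather than on the symbols directly, so any continuous evolutionary algorithm could be plugged in as the optimizer; the fitness of a weight vector would be the fitness of the expression its decoded gene string represents.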
