Dilip Singh Sisodia

Multi-Objective Evolutionary approach for the Performance Improvement of Learners using Ensembling Feature selection and Discretization Technique on Medical data

Apr 16, 2020

Deepak Singh, Dilip Singh Sisodia, Pradeep Singh

Abstract: Biomedical data is filled with continuous real values; these values in the feature set tend to create problems such as underfitting, the curse of dimensionality, and an increased misclassification rate because of higher variance. In response, pre-processing techniques applied to the dataset minimize these side effects and have shown success in maintaining adequate accuracy. Feature selection and discretization are two necessary preprocessing steps that have been effectively employed to handle data redundancies in biomedical data. However, previous works have not integrated feature selection and discretization into a unified effort to solve the data redundancy problem, which has left the field disjoint and fragmented. This paper proposes a novel multi-objective dimensionality reduction framework that incorporates both discretization and feature reduction as an ensemble model for performing feature selection and discretization. The selection of optimal features and the categorization of discretized and non-discretized features from the feature subset are governed by a multi-objective genetic algorithm (NSGA-II). The two objectives, minimizing the error rate during feature selection and maximizing the information gain during discretization, are used as fitness criteria.
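
The two fitness criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `dominates`, `pareto_front`), the single binary cut point, and the toy numbers are all assumptions made for illustration. The sketch computes the information gain of one discretization cut and then keeps the candidates that are not dominated when error rate is minimized and information gain is maximized, which is the kind of comparison NSGA-II relies on when ranking solutions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, cut):
    """Information gain of splitting a continuous feature at `cut` (hypothetical helper)."""
    left = [y for x, y in zip(values, labels) if x <= cut]
    right = [y for x, y in zip(values, labels) if x > cut]
    if not left or not right:
        return 0.0  # a cut that separates nothing carries no information
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - remainder


def dominates(a, b):
    """True if candidate a = (error_rate, info_gain) Pareto-dominates b.

    Assumption: lower error rate is better, higher information gain is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Toy data, invented for illustration only.
values = [1.0, 2.0, 3.0, 4.0]
labels = [0, 0, 1, 1]
gain_at_2 = information_gain(values, labels, cut=2.0)

candidates = [(0.10, 0.5), (0.20, 0.9), (0.25, 0.4), (0.10, 0.3)]
front = pareto_front(candidates)
```

In this toy setting the cut at 2.0 separates the two classes perfectly, and the front keeps the two candidates that trade error rate against information gain.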

Clickbait Detection using Multiple Categorization Techniques

Mar 29, 2020

Abinash Pujahari, Dilip Singh Sisodia

Abstract: Clickbaits are online articles with deliberately misleading titles designed to lure more and more readers into opening the intended web page. Clickbaits are used to tempt visitors to click on a particular link, either to monetize the landing page or to spread false news for sensationalization. The presence of clickbaits on any news aggregator portal may lead to an unpleasant experience for readers. Automatic detection of clickbait headlines among news headlines has been a challenging issue for the machine learning community. Many methods have been proposed for preventing clickbait articles in the recent past. However, the recent techniques available for detecting clickbaits are not very robust. This paper proposes a hybrid categorization technique for separating clickbait and non-clickbait articles by integrating different features, sentence structure, and clustering. During preliminary categorization, the headlines are separated using eleven features. After that, the headlines are recategorized using sentence formality and syntactic similarity measures. In the last phase, the headlines are again recategorized by applying clustering using word vector similarity based on the t-distributed Stochastic Neighbor Embedding (t-SNE) approach. After categorization of these headlines, machine learning models are applied to the data set to evaluate the machine learning algorithms. The experimental results indicate that the proposed hybrid model is more robust, reliable, and efficient than any individual categorization technique for the real-world dataset we used.

* 2019
* 11 pages, 7 figures, 4 tables; to be published in Journal of Information Science
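
The preliminary, feature-based categorization described in the abstract can be pictured with a rough Python sketch. The abstract does not list the eleven features, so the four surface features below, the `SECOND_PERSON` word list, and the `preliminary_score` rule are hypothetical stand-ins and should not be read as the authors' feature set.

```python
import re

# Hypothetical word list; the paper's actual features are not given in the abstract.
SECOND_PERSON = {"you", "your", "you're", "yours"}


def headline_features(headline):
    """Extract a few illustrative surface features from a headline."""
    words = re.findall(r"[A-Za-z0-9']+", headline.lower())
    return {
        "word_count": len(words),
        "starts_with_number": bool(words) and words[0].isdigit(),
        "has_second_person": any(w in SECOND_PERSON for w in words),
        "ends_with_punct": headline.rstrip().endswith(("!", "?")),
    }


def preliminary_score(features):
    """Count how many of the boolean cues fire (an assumed, very crude rule)."""
    cues = ("starts_with_number", "has_second_person", "ends_with_punct")
    return sum(int(features[name]) for name in cues)


bait = headline_features("10 Things You Won't Believe!")
plain = headline_features("Parliament passes annual budget")
```

A higher score here only means more of the assumed cues are present; a realistic pipeline would pass such features to a trained classifier and to the later recategorization phases rather than thresholding a count.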
