Title: EvoSplit: An evolutionary approach to split a multi-label data set into disjoint subsets

Mar 03, 2021

Francisco Florez-Revuelta

Abstract: This paper presents EvoSplit, a new evolutionary approach for splitting multi-label data sets into disjoint subsets for supervised machine learning. Currently, data set providers divide a data set either randomly or using iterative stratification, a method that aims to maintain the label (or label pair) distribution of the original data set in each of the subsets. With the same aim, this paper first introduces a single-objective evolutionary approach that seeks a split maximizing the similarity to either the label distribution or the label pair distribution, each considered independently. Second, a new multi-objective evolutionary algorithm is presented that maximizes the similarity for both distributions (label and label pair) simultaneously. Both approaches are validated using well-known multi-label data sets as well as large image data sets currently used in computer vision and machine learning applications. EvoSplit improves on the splits produced by iterative stratification according to several measures: Label Distribution, Label Pair Distribution, Examples Distribution, and the number of folds and fold-label pairs with zero positive examples.
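The single-objective idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the cost function below is a simplified stand-in for the Label Distribution measure (mean absolute gap between per-subset and overall label frequencies), the search is a basic swap-mutation loop that keeps a single candidate, and the names `label_frequencies`, `split_cost` and `evolve_split` are hypothetical.

```python
import random


def label_frequencies(labels, indices):
    """Fraction of positive examples per label among the given rows."""
    n_labels = len(labels[0])
    if not indices:
        return [0.0] * n_labels
    return [sum(labels[i][j] for i in indices) / len(indices)
            for j in range(n_labels)]


def split_cost(labels, assignment, n_subsets):
    """Simplified proxy (an assumption, not the paper's measure): mean
    absolute gap between subset and overall label frequencies.
    Lower means the subsets resemble the full data set more closely."""
    overall = label_frequencies(labels, range(len(labels)))
    total = 0.0
    for s in range(n_subsets):
        members = [i for i, a in enumerate(assignment) if a == s]
        freqs = label_frequencies(labels, members)
        total += sum(abs(f - o) for f, o in zip(freqs, overall))
    return total / (n_subsets * len(overall))


def evolve_split(labels, n_subsets=2, generations=2000, seed=0):
    """Swap-mutation search: keep one candidate split and accept a
    mutated child whenever it is no worse than the current best."""
    rng = random.Random(seed)
    n = len(labels)
    # Start from a shuffled, size-balanced assignment of rows to subsets.
    best = [i % n_subsets for i in range(n)]
    rng.shuffle(best)
    best_cost = split_cost(labels, best, n_subsets)
    for _ in range(generations):
        child = best[:]
        a, b = rng.sample(range(n), 2)
        # Swapping two rows' subsets keeps subset sizes fixed and disjoint.
        child[a], child[b] = child[b], child[a]
        cost = split_cost(labels, child, n_subsets)
        if cost <= best_cost:
            best, best_cost = child, cost
    return best, best_cost
```

Here `labels` is assumed to be a list of binary label vectors, one per example, and the returned `assignment[i]` gives the subset index of example `i`. A multi-objective variant would track a second cost for label pairs and keep a set of non-dominated splits rather than a single best one.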

* This work has been submitted to a journal for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
