Abstract: We propose a Similarity-Based Stratified Splitting (SBSS) technique, which uses both input-space and output-space information to split the data. The splits are generated using similarity functions among samples so that similar samples are placed in different splits. This approach gives a better representation of the data in the training phase and leads to a more realistic performance estimate when the model is used in real-world applications. We evaluate our proposal on twenty-two benchmark datasets with classifiers such as Multi-Layer Perceptron, Support Vector Machine, Random Forest, and K-Nearest Neighbors, and with five similarity functions: Cityblock, Chebyshev, Cosine, Correlation, and Euclidean. According to the Wilcoxon signed-rank test, our approach consistently outperformed ordinary stratified 10-fold cross-validation in 75\% of the assessed scenarios.
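The splitting idea described in the abstract above can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so everything below is an assumption made for illustration: the function name `similarity_stratified_folds`, the random-pivot strategy, and the choice of Euclidean distance (one of the five functions listed) are hypothetical and are not the authors' reference implementation. Within each class, a pivot sample and its nearest neighbours are dealt to different folds, which keeps class proportions balanced while separating similar samples.

```python
import numpy as np

def similarity_stratified_folds(X, y, n_folds=3, seed=0):
    """Return a fold index per sample (hypothetical sketch, not the paper's code).

    Within each class, a random pivot and its (n_folds - 1) nearest
    remaining samples are placed in distinct folds.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    folds = np.full(len(y), -1, dtype=int)
    next_fold = 0  # carried across groups so fold sizes stay balanced

    for label in np.unique(y):
        remaining = [int(i) for i in np.flatnonzero(y == label)]
        while remaining:
            # Pick a random pivot among the unassigned samples of this class.
            pivot = remaining.pop(int(rng.integers(len(remaining))))
            group = [pivot]
            if remaining:
                # Euclidean distance from the pivot to every unassigned sample.
                dist = np.linalg.norm(X[remaining] - X[pivot], axis=1)
                chosen = set(np.argsort(dist)[: n_folds - 1].tolist())
                group += [remaining[i] for i in sorted(chosen)]
                remaining = [s for i, s in enumerate(remaining) if i not in chosen]
            # A group has at most n_folds members, so each lands in a distinct fold.
            for sample in group:
                folds[sample] = next_fold
                next_fold = (next_fold + 1) % n_folds
    return folds
```

Swapping the distance line for another metric (for example cityblock or cosine) would give the other variants named in the abstract; the returned fold indices can then be used like those of any k-fold splitter.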
Abstract: Clustering analysis has become a ubiquitous information retrieval tool in a wide range of domains, but a more automatic framework is still lacking. Though internal metrics are the key players in the successful retrieval of clusters, their effectiveness on real-world datasets is still not fully understood, mainly because they make unrealistic assumptions about the underlying datasets. We hypothesized that capturing {\it traces of information gain} between increasingly complex clustering retrievals ({\it InfoGuide}) enables an automatic clustering analysis with improved clustering retrievals. We validated the {\it InfoGuide} hypothesis by capturing the traces of information gain using the Kolmogorov-Smirnov statistic and comparing the clusters retrieved by {\it InfoGuide} against those retrieved by other commonly used internal metrics on artificially generated, benchmark, and real-world datasets. Our results suggest that {\it InfoGuide} can enable a more automatic clustering analysis and may be more suitable for retrieving clusters in real-world datasets displaying nontrivial statistical properties.
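The Kolmogorov-Smirnov statistic mentioned in the abstract above is the largest gap between two empirical cumulative distribution functions. The sketch below shows only that building block; the helper name `ks_statistic` is hypothetical, and how {\it InfoGuide} chooses the two samples to compare across successive clusterings is not specified in the abstract, so that step is deliberately left out.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max |ECDF_a(x) - ECDF_b(x)|."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # The maximum gap can only occur at one of the observed values.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))
```

A value near 0 means the two samples look alike, while a value near 1 means they barely overlap; under the assumption that the compared samples summarise consecutive clusterings, a small value would indicate that the more complex clustering adds little.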
Abstract: Computational swarm intelligence consists of multiple simple artificial agents exchanging information while exploring a search space. Despite a rich literature in the field, with works improving old approaches and proposing new ones, the mechanism by which complex behavior emerges in these systems is still not well understood. This literature gap hinders researchers' ability to deal with known problems in swarm intelligence such as premature convergence and the balance of coordination and diversity among agents. Recent advances in the literature, however, have proposed studying these systems via the network that emerges from the social interactions within the swarm (i.e., the interaction network). In our work, we propose a definition of the interaction network for the Artificial Bee Colony (ABC) algorithm. With our approach, we captured striking idiosyncrasies of the algorithm. We uncovered the different patterns of social interactions that emerge from each type of bee, revealing the importance of the bees' variations throughout the iterations of the algorithm. We found that ABC exhibits a dynamic information flow through the use of different bees but lacks continuous coordination between the agents.
Abstract: Self-organization is a natural phenomenon that emerges in systems with a large number of interacting components. Self-organized systems show robustness, scalability, and flexibility, which are essential properties when handling real-world problems. Swarm intelligence seeks to design nature-inspired algorithms with a high degree of self-organization. Yet we do not know why swarm-based algorithms work well, nor can we compare the different approaches in the literature. The lack of a common framework capable of characterizing the various swarm-based algorithms, transcending their particularities, has led to a stream of publications inspired by different aspects of nature without much regard as to whether they are similar to already existing approaches. We address this gap by introducing a network-based framework, the interaction network, to examine computational swarm-based systems through the lens of social dynamics. We discuss the social dimension of several swarm classes and provide a case study of Particle Swarm Optimization. The interaction network enables a better understanding of the plethora of approaches currently available by looking at them from a general perspective focusing on the structure of the social interactions.
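As a rough illustration of the interaction network idea described above, the sketch below accumulates a weighted directed graph from a log of social interactions. The log format `(iteration, source, target)`, meaning that agent `source` influenced agent `target` at that iteration, and the helper names `interaction_network` and `in_strength` are assumptions made for this example; the abstract does not define how interactions are recorded for any particular algorithm.

```python
import numpy as np

def interaction_network(events, n_agents):
    """Build a weighted adjacency matrix from a hypothetical interaction log.

    W[source, target] counts how many times `source` influenced `target`.
    """
    W = np.zeros((n_agents, n_agents), dtype=int)
    for _iteration, source, target in events:
        W[source, target] += 1
    return W

def in_strength(W):
    """Total influence received by each agent (column sums of W)."""
    return W.sum(axis=0)

# Toy log: agent 0 influences agent 1 twice, agent 2 influences agent 1 once,
# and agent 1 influences agent 2 once.
events = [(0, 0, 1), (0, 2, 1), (1, 0, 1), (1, 1, 2)]
W = interaction_network(events, n_agents=3)
received = in_strength(W)
```

Because the matrix depends only on who influenced whom, the same structure could in principle be filled from different swarm algorithms, which is what makes a comparison across them possible.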