Abstract: Signaling proteins are an important topic in drug development because of the growing need for fast, accurate and cheap methods to evaluate new molecular targets involved in specific diseases. The complexity of protein structure hinders the direct association of signaling activity with molecular structure. The proposed solution therefore uses protein star graphs to encode peptide sequence information into specific topological indices calculated with the S2SNet tool. The Quantitative Structure-Activity Relationship classification model obtained with Machine Learning techniques is able to predict new signaling peptides. The best classification model, which is the first signaling prediction model, is based on eleven descriptors and was obtained using the Support Vector Machines - Recursive Feature Elimination (SVM-RFE) technique with the Laplacian kernel (RFE-LAP); it reaches an AUROC of 0.961. The prediction performance of the model was assessed on a set of 3114 proteins of unknown function from the PDB database. Important signaling pathways are presented for three UniprotIDs (34 PDBs) with a signaling prediction greater than 98.0%.
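To make the two quantities named in this abstract concrete, the following minimal Python sketch shows one common form of the Laplacian kernel and a direct computation of AUROC. It is an illustration under stated assumptions, not the authors' implementation: the function names `laplacian_kernel` and `auroc` and the parameter `gamma` are hypothetical, and the kernel uses the L1 distance, whereas some libraries define the Laplacian kernel with the Euclidean distance instead.

```python
import math


def laplacian_kernel(x, y, gamma=1.0):
    """Laplacian kernel in its L1-distance form: exp(-gamma * ||x - y||_1).

    Assumption: other libraries use the Euclidean norm here instead.
    """
    l1_distance = sum(abs(a - b) for a, b in zip(x, y))
    return math.exp(-gamma * l1_distance)


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

Under this definition, an AUROC of 0.961 would mean that a randomly chosen signaling peptide receives a higher score than a randomly chosen non-signaling one in about 96% of pairs.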
Abstract: Traditionally, signal classification requires prior knowledge of the signals. Human experts decide which features are extracted from the signals and used as inputs to the classification system. As a result, significant unknown information in the signal may be missed by the experts and left out of the features. This paper proposes a new method that automatically analyses the signals and extracts the features without any human participation, so no prior knowledge about the signals to be classified is needed. The proposed method is based on Genetic Programming and, to test it, it has been applied to a well-known EEG database related to epilepsy, a disease suffered by millions of people. As the results section shows, high classification accuracies are obtained.
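As a rough illustration of how a Genetic Programming individual can act as a feature extractor, the sketch below evaluates a small expression tree on one signal window. Everything in it is an assumption made for illustration: the terminal set (`mean`, `std`, `max`, `min`), the function set, the tuple encoding and the hand-written individual are hypothetical and are not taken from the paper. The evolutionary loop (selection, crossover, mutation) is omitted.

```python
import statistics

# Hypothetical terminal set: simple statistics of a signal window.
TERMINALS = {
    "mean": statistics.fmean,
    "std": statistics.pstdev,
    "max": max,
    "min": min,
}


def protected_div(a, b):
    """Division that returns 1.0 for a near-zero denominator (a common GP convention)."""
    return a / b if abs(b) > 1e-9 else 1.0


# Hypothetical function set: binary arithmetic operators.
FUNCTIONS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": protected_div,
}


def evaluate(tree, signal):
    """Recursively evaluate an expression tree on one signal window.

    A tree is a terminal name, a numeric constant, or a tuple (op, left, right).
    """
    if isinstance(tree, str):
        return float(TERMINALS[tree](signal))
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    return FUNCTIONS[op](evaluate(left, signal), evaluate(right, signal))


# One hand-written individual: (max - min) / std, a scale-free measure of spread.
individual = ("div", ("sub", "max", "min"), "std")
window = [0.0, 1.0, 0.0, -1.0]
feature = evaluate(individual, window)
```

In a full system, many such trees would be generated and evolved, with each tree's fitness given by how well the features it produces separate the signal classes.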
Abstract: Data mining and data classification over biomedical data are two of the most important research fields in computer science. Among the great diversity of techniques that can be used for this purpose, Artificial Neural Networks (ANNs) are one of the most suitable. One of the main problems in developing ANNs is that the full process is slow. Traditionally, human experts must experiment with different architectures until they find one that gives correct results for a specific problem. However, many studies have emerged that describe ANN development techniques that are automated to a greater or lesser degree. In this paper, the authors focus on developing a new technique to perform this process over biomedical data. The new technique combines two Evolutionary Computation (EC) techniques, Genetic Algorithms and Genetic Programming, to develop ANNs automatically. The work goes further: the system described here yields simplified networks that solve the problems with a low number of neurons. The system is compared, over a set of well-known problems, with an existing system that also uses EC. These comparisons indicate that the new system produces very good results, which in the worst case are at least comparable to those of existing techniques and in many cases are substantially better.