Abstract: We present a catalogue of about 6 million unresolved photometric detections in the Sloan Digital Sky Survey Seventh Data Release, classifying them into stars, galaxies and quasars. We use a machine learning classifier trained on a subset of spectroscopically confirmed objects from 14th to 22nd magnitude in the SDSS {\it i}-band. Our catalogue consists of 2,430,625 quasars, 3,544,036 stars and 63,586 unresolved galaxies from 14th to 24th magnitude in the SDSS {\it i}-band. Our algorithm recovers 99.96% of spectroscopically confirmed quasars and 99.51% of stars to $i \sim 21.3$ in the colour window that we study. The level of contamination due to data artefacts for objects beyond $i=21.3$ is highly uncertain, and all statements about completeness and contamination in the paper are valid only for objects brighter than this magnitude. However, a comparison of the predicted number of quasars with the theoretical number counts shows reasonable agreement.
Abstract: A learning algorithm modelled on primary school teaching and learning is presented. The method is to evaluate a student continuously and to train them on the examples they repeatedly fail, until they can correctly answer all types of questions. This incremental learning procedure produces better learning curves by requiring the student to dedicate their learning time optimally to the failed examples. When used in machine learning, the algorithm is found to train a machine on data with maximum variance in the feature space, so that the generalization ability of the network improves. The algorithm has interesting applications in data mining, model evaluation and rare object discovery.
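The evaluate-then-retrain loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-nearest-neighbour "student", the function names `nearest_label` and `train_on_failures`, the seed size and the synthetic XOR-like data are all assumptions chosen only to keep the example short and self-contained.

```python
import random


def nearest_label(train, x):
    """Hypothetical 'student': answer with the label of the closest memorised example."""
    best = min(train, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
    return best[1]


def train_on_failures(pool, seed_size=2, max_rounds=None):
    """Evaluate the student on the whole pool and add every failed example
    to its training set, repeating until no example is failed."""
    if max_rounds is None:
        # Each failing round adds at least one new example, so len(pool)
        # rounds are always enough when all points are distinct.
        max_rounds = len(pool)
    train = list(pool[:seed_size])
    history = []  # number of failed examples per evaluation round
    for _ in range(max_rounds):
        failed = [ex for ex in pool if nearest_label(train, ex[0]) != ex[1]]
        history.append(len(failed))
        if not failed:
            break
        train.extend(failed)  # "teach" only what the student got wrong
    return train, history


# Synthetic data (an assumption for illustration): XOR-like labels on a square.
rng = random.Random(0)
pool = []
for _ in range(200):
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    pool.append((x, int(x[0] * x[1] > 0)))

train, history = train_on_failures(pool)
print("failures per round:", history)
print("training examples kept:", len(train), "of", len(pool))
```

In this toy setting the loop stops once the student answers every example in the pool correctly, and the training set typically ends up much smaller than the pool, since only the failed examples are ever added.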
Abstract: Rainfall in Kerala State, the southern part of the Indian Peninsula in particular, is caused by the two monsoons and the two cyclones every year. In general, climate and rainfall are highly nonlinear phenomena in nature, giving rise to what is known as the `butterfly effect'. We nevertheless attempt to train an ABF neural network on the time series rainfall data and show for the first time that, in spite of the fluctuations resulting from the nonlinearity in the system, the trends in the rainfall pattern in this corner of the globe remained unaffected over the 87 years from 1893 to 1980. We also successfully filter out the chaotic part of the system and illustrate that its effects on long-term predictions are marginal.
Abstract: The difference-boosting algorithm is applied to the letters dataset from the UCI repository to classify distorted raster images of letters of the English alphabet. In contrast to rather complex networks, difference-boosting is found to produce comparable or better classification efficiency on this complex problem.
Abstract: A Bayesian classifier that up-weights the differences in the attribute values is discussed. Using four popular datasets from the UCI repository, some interesting features of the network are illustrated. The network is suitable for classification problems.