Abstract: This paper studies the relation between computational learning theory and resource-bounded dimension. We aim to establish close connections between the learnability or nonlearnability of a concept class and its size in terms of effective dimension. Such connections allow powerful dimension techniques to be used in computational learning and, vice versa, allow learning results to be imported into complexity theory via dimension. First, we obtain a tight result on the dimension of online mistake-bound learnable classes. Second, in relation to PAC learning, we show that the polynomial-space dimension of PAC learnable concept classes is zero. This provides a hypothesis on effective dimension that implies the inherent unpredictability of concept classes (classes with this property are not efficiently PAC learnable using any hypothesis). Third, we prove that the polynomial-space dimension of concept classes learnable by a membership-query algorithm is zero.
Abstract: One of the core applications of machine learning to knowledge discovery consists of building a function (a hypothesis) from a given amount of data (for instance, a decision tree or a neural network) that can afterwards be used to predict new instances of the data. In this paper, we focus on the particular situation where the hypothesis to be used for prediction is assumed to be very simple, so that the hypothesis class is of feasible size. We study the problem of determining which hypothesis in the class is almost the best one. We present two on-line sampling algorithms for selecting hypotheses, give theoretical bounds on the number of examples they need, and analyze them experimentally. We compare them with the simple batch sampling approach commonly used and show that in most situations our algorithms use far fewer examples.
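
To make the batch baseline mentioned above concrete, the following is a minimal sketch of batch sampling for hypothesis selection. It is not the paper's on-line algorithms, which the abstract does not specify. It assumes a standard Hoeffding bound combined with a union bound to fix the sample size in advance; the function names (`batch_sample_size`, `select_batch`), the threshold hypotheses, and the example generator are all hypothetical and chosen only for illustration.

```python
import math
import random


def batch_sample_size(num_hypotheses, epsilon, delta):
    """Sample size from Hoeffding's inequality plus a union bound (an assumption,
    not a bound taken from the paper): with probability at least 1 - delta, every
    empirical accuracy is within epsilon / 2 of its true value, so the empirical
    best hypothesis is within epsilon of the true best."""
    return math.ceil((2.0 / epsilon ** 2) * math.log(2.0 * num_hypotheses / delta))


def select_batch(hypotheses, draw_example, epsilon, delta):
    """Draw a fixed-size batch, score every hypothesis on it, and return the
    index of the empirically best hypothesis together with the batch size."""
    m = batch_sample_size(len(hypotheses), epsilon, delta)
    correct = [0] * len(hypotheses)
    for _ in range(m):
        x, y = draw_example()
        for i, h in enumerate(hypotheses):
            if h(x) == y:
                correct[i] += 1
    best = max(range(len(hypotheses)), key=lambda i: correct[i])
    return best, m


def make_threshold(t):
    """Hypothetical hypothesis: predict True when x is at least t."""
    return lambda x: x >= t


# Illustrative setup: the target concept is the threshold 0.5 on [0, 1).
rng = random.Random(0)


def draw_example():
    x = rng.random()
    return x, x >= 0.5


thresholds = [0.1, 0.3, 0.5, 0.7]
hypotheses = [make_threshold(t) for t in thresholds]
best_index, examples_used = select_batch(hypotheses, draw_example, epsilon=0.1, delta=0.05)
print(thresholds[best_index], examples_used)
```

The batch size here depends only on the number of hypotheses, epsilon, and delta, never on the data. An on-line approach of the kind the abstract describes can instead stop as soon as the observed examples separate the hypotheses, which is where savings in the number of examples would come from.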