Abstract: Decision tree induction systems are being used for knowledge acquisition in noisy domains. This paper develops a subjective Bayesian interpretation of the task tackled by these systems and of the heuristic methods they use. It is argued that decision tree systems implicitly incorporate a prior belief that the simpler (in terms of decision tree complexity) of two hypotheses should be preferred, all else being equal, and that they perform a greedy search of the space of decision rules to find one in which there is strong posterior belief. A number of improvements to these systems are then suggested.
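To make the simplicity prior concrete, one illustrative formulation (a sketch consistent with the abstract, not necessarily the paper's exact prior; the symbols T, D, |T| and c are assumptions introduced here) scores a tree T against data D by posterior belief, with prior probability decaying in tree complexity:

\[
P(T \mid D) \;\propto\; P(D \mid T)\,P(T), \qquad P(T) \propto e^{-c\,|T|},
\]

where |T| is a complexity measure such as the number of nodes and c > 0 sets the strength of the simplicity preference; greedy induction then adds, at each step, the split that most increases this posterior.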
Abstract: Theory refinement is the task of updating a domain theory in the light of new cases, either automatically or with some expert assistance. The problem of theory refinement under uncertainty is reviewed here in the context of Bayesian statistics, a theory of belief revision. The problem is reduced to an incremental learning task as follows: the learning system is initially primed with a partial theory supplied by a domain expert, and thereafter maintains its own internal representation of alternative theories, which the domain expert can interrogate and which can be incrementally refined from data. Algorithms for refinement of Bayesian networks are presented to illustrate what is meant by "partial theory", "alternative theory representation", and so on. The algorithms are an incremental variant of batch learning algorithms from the literature, so they work well in both batch and incremental mode.
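As a minimal sketch of what incremental refinement from data can mean for the parameters of a Bayesian network with fixed structure (the class name, toy network and hyperparameters below are illustrative assumptions, not the paper's algorithms), conjugate Dirichlet counts make batch and incremental updates equivalent:

from collections import defaultdict

class CPT:
    # Conditional probability table for one network node, with a
    # symmetric Dirichlet prior. Dirichlet counts are additive, so
    # presenting cases one at a time (incremental mode) or all at
    # once (batch mode) yields the same posterior.
    def __init__(self, values, prior=1.0):
        self.values = values                      # states of this node
        self.prior = prior                        # Dirichlet pseudo-count
        self.counts = defaultdict(lambda: defaultdict(float))

    def update(self, parent_state, value):
        # Refine the table with one observed case.
        self.counts[parent_state][value] += 1.0

    def prob(self, parent_state, value):
        # Posterior mean estimate under the Dirichlet prior.
        row = self.counts[parent_state]
        total = sum(row.values()) + self.prior * len(self.values)
        return (row[value] + self.prior) / total

# Toy two-node network, Rain -> WetGrass, with structure held fixed.
wet_grass = CPT(values=("yes", "no"))
for rain, grass in [("yes", "yes"), ("yes", "yes"), ("no", "no")]:
    wet_grass.update(parent_state=rain, value=grass)
print(wet_grass.prob("yes", "yes"))   # belief after three cases: 0.75

Because the counts are additive, one update rule serves both modes, which is the property the abstract appeals to; refining structure as well as parameters requires more than this sketch shows.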
Abstract: This paper presents a plausible reasoning system to illustrate some broad issues in knowledge representation: dualities between different reasoning forms, the difficulty of unifying complementary reasoning styles, and the approximate nature of plausible reasoning. These issues share a common theme: there should be an underlying belief calculus of which the many different reasoning forms are special cases, some of them approximate. The system presented allows reasoning about defaults, likelihood, necessity and possibility in a manner similar to the earlier work of Adams. It is based on the belief calculus of subjective Bayesian probability, which itself rests on a few simple assumptions about how belief should be manipulated. Approximations, semantics, consistency and consequence results are presented for the system. While this puts these often-discussed plausible reasoning forms on a probabilistic footing, useful application to practical problems remains an open issue.
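For orientation, the standard reading in the style of Adams (a textbook formulation, not a quotation from this paper) treats a default rule "A's are normally B's" as a constraint of high conditional probability:

\[
A \rightsquigarrow B \quad\text{is accepted iff}\quad P(B \mid A) \ge 1 - \varepsilon
\]

for some small \(\varepsilon > 0\), so that chaining defaults becomes approximate reasoning about bounds on conditional probabilities.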
Abstract: Chain graphs combine directed and undirected graphs, and their underlying mathematics combines properties of the two. This paper gives a simplified definition of chain graphs based on a hierarchical combination of Bayesian (directed) and Markov (undirected) networks. Examples of chain graphs include multivariate feed-forward networks, clustering with conditional interaction between variables, and forms of Bayes classifiers. Chain graphs are then extended using the notation of plates, so that samples and data analysis problems can also be represented in a graphical model. Implications for learning are discussed in the conclusion.
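Concretely, in standard chain graph notation (the symbols here are illustrative), the hierarchical combination factorizes the joint distribution over blocks of variables, directed between blocks and undirected within them:

\[
p(x) \;=\; \prod_{\tau \in \mathcal{T}} p\!\left(x_\tau \mid x_{\mathrm{pa}(\tau)}\right),
\]

where each block \(x_\tau\) is a set of variables, \(\mathrm{pa}(\tau)\) indexes its parent blocks, and each conditional factor is itself specified by a Markov (undirected) network; a plate then replicates this structure once per sample, so the data analysis problem appears in the same graphical model.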
Abstract: Methods for analysis of principal components in discrete data have existed for some time under various names, such as grade of membership modelling, probabilistic latent semantic analysis, and genotype inference with admixture. In this paper we explore a number of extensions to the common theory and present applications of these methods to common statistical tasks. We show that these methods can be interpreted as a discrete version of ICA. We develop a hierarchical version yielding components at different levels of detail, together with additional techniques for Gibbs sampling. We compare the algorithms on a text prediction task using support vector machines and on an information retrieval task.
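As a sketch of the kind of Gibbs sampling involved (a collapsed sampler for a simple Dirichlet-multinomial component model in the style of latent Dirichlet allocation; all names, sizes and hyperparameters below are illustrative assumptions, not the paper's algorithms):

import numpy as np

def gibbs_step(docs, z, n_dk, n_kw, n_k, alpha=0.1, beta=0.01):
    # One sweep of collapsed Gibbs sampling: resample the component
    # assignment of every word token, conditioned on all the others.
    K, W = n_kw.shape
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove this token's current assignment from the counts.
            n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
            # Unnormalised posterior over components for this token.
            p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + W * beta)
            k = np.random.choice(K, p=p / p.sum())
            # Record the new assignment and restore the counts.
            z[d][i] = k
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

# Toy corpus: documents as lists of word ids; K components, W word types.
docs = [[0, 1, 1, 2], [2, 3, 3, 0]]
K, W = 2, 4
z = [[np.random.randint(K) for _ in doc] for doc in docs]
n_dk = np.zeros((len(docs), K)); n_kw = np.zeros((K, W)); n_k = np.zeros(K)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        n_dk[d, z[d][i]] += 1; n_kw[z[d][i], w] += 1; n_k[z[d][i]] += 1
for sweep in range(100):
    gibbs_step(docs, z, n_dk, n_kw, n_k)

After the sweeps, the row-normalised counts n_dk and n_kw estimate the per-document component mixtures and per-component word distributions respectively; a hierarchical version, as in the abstract, would additionally organise the components into levels of detail.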