Abstract: Learning classifier systems (LCSs) originated in cognitive-science research but have since evolved into powerful classification techniques. Modern LCSs can be used to extract building blocks of knowledge to solve more difficult problems in the same or a related domain. Recent work on LCSs showed that reusing knowledge by adopting Code Fragments (GP-like tree-based programs) into LCSs could provide advances in scaling. However, solving hard problems often requires constructing high-level building blocks, which in turn produces an intractable search space, so a limit of scaling will eventually be reached. Inspired by human problem-solving abilities, XCSCF* can reuse learned knowledge and learned functionality to scale to complex problems by transferring them from simpler problems using layered learning. However, this method was unrefined and suited only to the Multiplexer problem domain. In this paper, we propose improvements to XCSCF* that make it robust across multiple problem domains. This is demonstrated on the benchmark Multiplexer, Carry-one, Majority-on, and Even-parity domains. The base axioms necessary for learning are proposed, methods for transfer learning in LCSs are developed, and learning is recast as a decomposition into a series of subordinate problems. Results show that from a conventional tabula rasa, with only a vague notion of which subordinate problems might be relevant, it is possible to capture the general logic behind the tested domains, so the improved system is capable of solving any individual n-bit Multiplexer, n-bit Carry-one, n-bit Majority-on, or n-bit Even-parity problem.
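For readers unfamiliar with the four benchmark domains named in this abstract, the following is a minimal Python sketch of their target functions as they are commonly stated for Boolean benchmarks. It is not taken from the paper: the function names, the most-significant-bit-first ordering, and the layout of the two operands in the Carry problem are assumptions made for illustration.

```python
from typing import Sequence


def multiplexer(bits: Sequence[int]) -> int:
    """k address bits select one of the 2**k data bits that follow them."""
    k = 0
    while k + 2 ** k < len(bits):
        k += 1
    if k + 2 ** k != len(bits):
        raise ValueError("length must be k + 2**k, e.g. 3, 6, 11, 20")
    # Assumption: the address is read most-significant bit first.
    address = int("".join(str(b) for b in bits[:k]), 2) if k else 0
    return bits[k + address]


def carry(bits: Sequence[int]) -> int:
    """Return 1 if adding the two half-length binary numbers overflows."""
    if len(bits) == 0 or len(bits) % 2:
        raise ValueError("length must be even and non-zero")
    half = len(bits) // 2
    # Assumption: first half is one operand, second half the other.
    a = int("".join(str(b) for b in bits[:half]), 2)
    b = int("".join(str(b) for b in bits[half:]), 2)
    return int(a + b >= 2 ** half)


def majority_on(bits: Sequence[int]) -> int:
    """Return 1 if strictly more than half of the bits are 1."""
    return int(sum(bits) > len(bits) / 2)


def even_parity(bits: Sequence[int]) -> int:
    """Return 1 if the number of 1 bits is even."""
    return int(sum(bits) % 2 == 0)
```

Each function accepts any valid input length, which mirrors the abstract's claim of solving "any individual n-bit" instance; a learner, by contrast, has to discover this general logic from examples.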
Abstract: Multitask Learning is a learning paradigm that deals with multiple different tasks in parallel and transfers knowledge among them. XOF, a Learning Classifier System that uses tree-based programs to encode building blocks (meta-features), constructs features with rich discriminative information for classification tasks and collects them in an observed list. This paper seeks to automate feature transfer between tasks by utilising the observed list. We hypothesise that the best discriminative features of a classification task carry its characteristics. Therefore, the relatedness between any two tasks can be estimated by comparing their most appropriate patterns. We propose a multiple-XOF system, called mXOF, that can dynamically adapt feature transfer among XOFs. This system utilises the observed list to estimate task relatedness, which enables feature transfer to be automated. In terms of knowledge discovery, the resemblance estimation provides insight into the relations among multiple datasets. We tested mXOF on various scenarios, e.g. representative Hierarchical Boolean problems, classification of distinct classes in the UCI Zoo dataset, and unrelated tasks, to validate its ability to transfer knowledge automatically and to estimate task relatedness. Results show that mXOF produces reasonable estimates of the relatedness between multiple tasks, which aids learning performance through dynamic feature transfer.
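The idea of estimating task relatedness by comparing each task's best features can be illustrated with a small Python sketch. The abstract does not specify the comparison measure, so the Jaccard overlap used here is a stand-in assumption, not mXOF's actual method; the function names, the string encoding of features, and the threshold parameter are likewise hypothetical.

```python
from typing import Dict, List, Sequence


def relatedness(observed_a: Sequence[str], observed_b: Sequence[str]) -> float:
    """Jaccard overlap of two observed lists (an assumed stand-in measure)."""
    a, b = set(observed_a), set(observed_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def transfer_sources(
    observed: Dict[str, List[str]], target: str, threshold: float
) -> List[str]:
    """Names of the other tasks whose features look related enough to share."""
    return [
        name
        for name, features in observed.items()
        if name != target and relatedness(observed[target], features) >= threshold
    ]


# Hypothetical observed lists for three tasks.
observed = {
    "task_a": ["and(x0,x1)", "xor(x2,x3)"],
    "task_b": ["and(x0,x1)", "xor(x2,x3)", "not(x4)"],
    "task_c": ["or(x5,x6)"],
}
sources = transfer_sources(observed, "task_a", threshold=0.5)
```

With these hypothetical lists, `task_b` shares two of three distinct features with `task_a` and passes the threshold, while `task_c` shares none and is excluded, which is the kind of selective transfer the abstract describes for unrelated tasks.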
Abstract: A major goal of machine learning is to create techniques that abstract away irrelevant information. The generalisation property of standard Learning Classifier Systems (LCSs) removes such information at the feature level but not at the feature-interaction level. Code Fragments (CFs), a form of tree-based programs, introduced feature manipulation to discover important interactions, but they often contain irrelevant information, which causes structural inefficiency. XOF is a recently introduced LCS that uses CFs to encode building blocks of knowledge about feature interaction. This paper aims to optimise the structural efficiency of CFs in XOF. We propose two measures that improve CF construction to achieve this goal. First, a new CF-fitness update estimates the applicability of CFs while also accounting for their structural complexity. Second, a niche-based method is used to generate CFs. These approaches were tested on Even-parity and Hierarchical problems, which require highly complex combinations of input features to capture the data patterns. The results show that the proposed methods significantly increase the structural efficiency of CFs, which is estimated by the rule "generality rate". This results in faster learning in the Hierarchical Majority-on problem. Furthermore, a user-set depth limit for CF generation is not needed, as the learning agent will not adopt higher-level CFs once optimal CFs are constructed.
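To make the notion of structural inefficiency concrete, the Python sketch below represents a CF as a small expression tree and compares two trees that compute the same 3-bit Even-parity function but differ in size. The `Node` class, the function set, and the use of node count as the complexity measure are assumptions for illustration; the abstract does not give XOF's actual representation or its fitness update.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# Assumed function set for the illustration.
FUNCTIONS = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
    "not": lambda a: 1 - a,
}


@dataclass(frozen=True)
class Node:
    op: str  # a key of FUNCTIONS, or "x" for an input-feature leaf
    index: int = 0  # feature index, used only when op == "x"
    children: Tuple["Node", ...] = ()


def x(i: int) -> Node:
    """Leaf that reads input bit i."""
    return Node("x", index=i)


def evaluate(node: Node, bits: Sequence[int]) -> int:
    """Recursively evaluate the tree on one input instance."""
    if node.op == "x":
        return bits[node.index]
    args = [evaluate(child, bits) for child in node.children]
    return FUNCTIONS[node.op](*args)


def size(node: Node) -> int:
    """Node count, used here as a simple structural-complexity measure."""
    return 1 + sum(size(child) for child in node.children)


# not(xor(xor(x0, x1), x2)): 3-bit even parity.
compact = Node("not", children=(
    Node("xor", children=(Node("xor", children=(x(0), x(1))), x(2))),
))

# Same function, but x0 is wrapped in the redundant subtree or(x0, x0).
redundant = Node("not", children=(
    Node("xor", children=(
        Node("xor", children=(Node("or", children=(x(0), x(0))), x(1))),
        x(2),
    )),
))
```

Both trees return identical outputs on every input, yet `size(compact)` is 6 and `size(redundant)` is 8. An accuracy-only fitness cannot tell them apart, which is why a fitness that also considers structural complexity, as the abstract proposes, can favour the compact form.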