ABSTRACT
Inductive Logic Programming (ILP) is a popular approach for learning in a relational environment. Given a set of positive and negative examples, an ILP system finds a logical description of the underlying data model that differentiates between the positive and negative examples. The key question becomes how to combine a set of rules to obtain a useful classifier. Previous work has shown that an effective approach is to treat each learned rule as an attribute in a propositional learner, and to use the resulting classifier to determine the final label of an example [3]. This methodology defines a two-step process: in the first step, an ILP algorithm learns a set of rules; in the second step, a classifier combines the learned rules. One weakness of this approach is that the rules learned in the first step are evaluated by a different metric than the one by which they are ultimately scored in the second step. ILP traditionally scores clauses with a coverage or compression metric, so we have no guarantee that the rule learning process will select the rules that best contribute to the final classifier.

We propose an alternative approach, based on the idea of constructing the classifier as we learn the rules [2, 4]. In our approach, rules are scored by how much they improve the classifier, providing a tight coupling between rule generation and rule usage. We call this novel methodology Score As You Use (SAYU) [2].

To implement SAYU, we first defined an interface that allows an ILP algorithm to control a propositional learner. Second, we developed a greedy algorithm that uses the interface to decide whether to retain a candidate clause. We implemented this interface using Aleph to learn ILP rules and Bayesian networks as the combining mechanism, with two Bayes net structure learning algorithms, Naïve Bayes and Tree Augmented Naïve Bayes (TAN), as the propositional learners.
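The greedy retention scheme can be sketched as follows. This is an illustrative Python sketch under our own naming, not the actual Aleph/SAYU interface: `propose_clause`, `add_feature`, `learn_network`, and `score_network` are hypothetical callbacks standing in for the ILP proposer, feature construction, structure learning, and tuning-set evaluation.

```python
# Hypothetical sketch of the SAYU greedy loop; all names are illustrative.
def sayu(propose_clause, add_feature, learn_network, score_network,
         train_set, tune_set, max_proposals):
    """Greedily retain clauses that improve the classifier's tuning-set score."""
    best_score = score_network(learn_network(train_set), tune_set)
    theory = []
    for _ in range(max_proposals):
        clause = propose_clause()                   # e.g. next clause from Aleph
        candidate = add_feature(train_set, clause)  # clause becomes a boolean feature
        net = learn_network(candidate)              # relearn Naive Bayes / TAN structure
        score = score_network(net, tune_set)        # e.g. AUC-PR for recall > 0.2
        if score > best_score:                      # keep only clauses that help
            best_score, train_set = score, candidate
            theory.append(clause)
    return theory
```

The design point is that the clause's own coverage never enters the decision: a clause survives only if the relearned network scores better on held-out data.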
We score the network by computing the area under the precision-recall curve for levels of recall greater than 0.2. Aleph proposes a candidate clause, which is introduced as a new feature in the training set. A new network topology is learned on the extended training set, and the new network is evaluated on a tuning set. If the score of the new network exceeds the previous score, we retain the new rule in the training set; otherwise the rule is discarded. The figure compares performance on the Breast Cancer dataset [1]. These results show that, given the same amount of CPU time, SAYU clearly outperforms the original two-step approach. Furthermore, SAYU learns smaller theories. These results were obtained even though SAYU considers far fewer rules than standard ILP.
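The scoring metric can be illustrated with a small sketch. This is our own simple trapezoidal approximation of the area under the precision-recall curve restricted to recalls above a threshold; the function name and the interpolation scheme are assumptions for illustration, not the paper's exact computation.

```python
# Illustrative: area under a precision-recall curve for recall >= recall_min,
# using trapezoidal approximation. Names and interpolation are ours, not SAYU's.
def aucpr_above(points, recall_min=0.2):
    """points: (recall, precision) pairs sorted by increasing recall."""
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if r1 <= recall_min:
            continue                  # segment lies entirely below the threshold
        if r0 < recall_min:           # clip the segment at recall = recall_min
            p0 = p0 + (p1 - p0) * (recall_min - r0) / (r1 - r0)
            r0 = recall_min
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area

# e.g. aucpr_above([(0.0, 1.0), (0.5, 0.8), (1.0, 0.4)])
```

Restricting to recall above 0.2 focuses the score on the operating region of interest rather than the (often noisy) high-precision, low-recall corner of the curve.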
REFERENCES
- [1] J. Davis, E. Burnside, I. Dutra, D. Page, R. Ramakrishnan, V. Santos Costa, and J. Shavlik. View learning for statistical relational learning: With an application to mammography. In Proceedings of the 19th IJCAI, Edinburgh, Scotland, 2005.
- [2] J. Davis, E. Burnside, I. C. Dutra, D. Page, and V. Santos Costa. An integrated approach to learning Bayesian networks of rules. In Proceedings of ECML 2005, Porto, Portugal, 2005.
- [3] J. Davis, V. Santos Costa, I. M. Ong, D. Page, and I. C. Dutra. Using Bayesian classifiers to combine rules. In 3rd Workshop on Multi-Relational Data Mining, Seattle, WA, USA, August 2004.
- [4] N. Landwehr, K. Kersting, and L. De Raedt. nFOIL: Integrating Naïve Bayes and FOIL. In Proceedings of the National Conference on Artificial Intelligence (AAAI), 2005.