DOI: 10.1145/1774088.1774327 · SAC Conference Proceedings · Research article

Evolutionary model tree induction

Published: 22 March 2010

ABSTRACT

Model trees are a particular type of decision tree employed to solve regression problems. They have the advantage of producing interpretable output with an acceptable level of predictive performance. Since generating an optimal model tree is an NP-complete problem, traditional model tree induction algorithms rely on a greedy heuristic, which may not converge to the globally optimal solution. We propose the evolutionary algorithm (EA) paradigm as an alternative heuristic for generating model trees, with the aim of improving convergence to globally optimal solutions. We evaluate the predictive performance of this new approach on public UCI datasets and compare the results with those of traditional greedy regression/model tree induction algorithms.
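As a rough illustration of the idea in the abstract (this is a hypothetical toy sketch, not the authors' algorithm), the snippet below evolves the split point of a one-split model tree with least-squares linear models at the leaves, using elitist truncation selection and Gaussian mutation. A greedy inducer would instead commit to a split via a single local scan; the evolutionary search keeps a population of candidate splits and refines them by fitness.

```python
import random

def fit_linear(xs, ys):
    # Ordinary least squares for y = a*x + b (1-D leaf model).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = (cov / var) if var else 0.0
    return a, my - a * mx

def mse(split, xs, ys):
    # Fitness of a candidate split: mean squared error of the
    # two linear leaf models induced by that split.
    left = [(x, y) for x, y in zip(xs, ys) if x <= split]
    right = [(x, y) for x, y in zip(xs, ys) if x > split]
    if not left or not right:
        return float("inf")  # degenerate split: one empty leaf
    err = 0.0
    for part in (left, right):
        px, py = zip(*part)
        a, b = fit_linear(px, py)
        err += sum((a * x + b - y) ** 2 for x, y in part)
    return err / len(xs)

def evolve_split(xs, ys, pop_size=20, gens=40, seed=0):
    # Elitist truncation selection + Gaussian mutation over splits.
    rng = random.Random(seed)
    lo, hi = min(xs), max(xs)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda s: mse(s, xs, ys))
        elite = pop[:pop_size // 4]  # survivors of this generation
        pop = elite + [rng.gauss(rng.choice(elite), (hi - lo) * 0.05)
                       for _ in range(pop_size - len(elite))]
    return min(pop, key=lambda s: mse(s, xs, ys))

# Piecewise-linear data with a breakpoint at x = 5: an ideal
# one-split model tree fits it exactly.
xs = [i * 0.1 for i in range(100)]
ys = [x if x <= 5 else 10 - x for x in xs]
best = evolve_split(xs, ys)
```

A full model tree inducer in this style would evolve whole trees (variable-length genomes with crossover and pruning) rather than a single threshold, but the fitness-driven global search over split structure is the same.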


Published in:
SAC '10: Proceedings of the 2010 ACM Symposium on Applied Computing
March 2010, 2712 pages
ISBN: 9781605586397
DOI: 10.1145/1774088
Copyright © 2010 ACM

Publisher: Association for Computing Machinery, New York, NY, United States



        Acceptance Rates

SAC '10 paper acceptance rate: 364 of 1,353 submissions (27%). Overall acceptance rate: 1,650 of 6,669 submissions (25%).
