Abstract
In classification problems, it is common for variables, both explanatory and class, to be ordinal. When the class variable is required to increase with a subset of the explanatory variables, the problem must satisfy the monotonicity constraint. It is well known that standard classification tree algorithms, such as CART or C4.5, are not guaranteed to produce monotonic trees, even when the data set is completely monotone. Recently, several classifiers have been designed to handle these kinds of problems; in decision trees, the growing and pruning mechanisms have been updated to improve the monotonicity of the resulting trees. In this paper, we study the suitability of these mechanisms for the generation of Random Forests. To this end, we propose a simple ensemble pruning mechanism based on the degree of monotonicity of the individual trees. The performance of several decision tree methods is evaluated through experimental studies on monotonic data sets. We conclude that the trees produced by the Random Forest also satisfy the monotonicity constraint, while achieving slightly better predictive performance than standard algorithms.
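The ensemble pruning idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes trees are represented simply as prediction functions, and it measures the degree of monotonicity as the fraction of comparable example pairs whose predictions do not violate the monotonicity constraint; the function names (`degree_of_monotonicity`, `prune_ensemble`) and the keep-fraction parameter are illustrative choices.

```python
# Hypothetical sketch of monotonicity-based ensemble pruning.
# Assumptions (not from the paper): each "tree" is a callable mapping a
# feature vector to an ordinal label, and monotonicity is scored on a
# reference set X as 1 minus the fraction of violated comparable pairs.
from itertools import combinations


def dominates(a, b):
    """True if example a is component-wise <= example b (a comparable pair)."""
    return all(x <= y for x, y in zip(a, b))


def degree_of_monotonicity(predict, X):
    """Fraction of comparable pairs in X whose predictions are monotone."""
    comparable = violations = 0
    for a, b in combinations(X, 2):
        for lo, hi in ((a, b), (b, a)):
            if dominates(lo, hi):
                comparable += 1
                # Monotonicity requires predict(lo) <= predict(hi).
                if predict(lo) > predict(hi):
                    violations += 1
    return 1.0 if comparable == 0 else 1.0 - violations / comparable


def prune_ensemble(trees, X, keep_fraction=0.5):
    """Keep only the most monotone trees of the ensemble."""
    scored = sorted(trees, key=lambda t: degree_of_monotonicity(t, X),
                    reverse=True)
    k = max(1, int(len(scored) * keep_fraction))
    return scored[:k]
```

For example, with a reference set `X = [(0, 0), (1, 0), (0, 1), (1, 1)]`, a fully monotone predictor such as `lambda x: x[0] + x[1]` scores 1.0, while an anti-monotone one such as `lambda x: -x[0]` scores lower and is discarded first when the ensemble is pruned.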
Cite this article
González, S., Herrera, F. & García, S. Monotonic Random Forest with an Ensemble Pruning Mechanism based on the Degree of Monotonicity. New Gener. Comput. 33, 367–388 (2015). https://doi.org/10.1007/s00354-015-0402-4