Abstract
Classifier chains (CC) are an effective approach for exploiting label dependencies in multi-label data. However, they have the disadvantage that the chain is either chosen completely at random or relies on a pre-specified ordering of the labels, which is expensive to compute. Moreover, the same ordering is used for every test instance, ignoring the fact that different orderings might be best suited to different test instances. We propose a new approach based on random decision trees (RDT) which can choose the label ordering for each prediction dynamically, depending on the respective test instance. RDT are not adapted to a specific learning task; instead, they allow the prediction objective to be defined on the fly at test time, which makes them a well-suited test bed for directly comparing different prediction schemes. Indeed, we show that dynamically selecting the next label improves over using a static ordering of the labels under an otherwise unchanged RDT model and experimental environment.
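The difference between a static and a dynamic chain can be illustrated with a minimal sketch. It is not the method of the paper: the count-based estimator below merely stands in for an RDT model, and the selection rule (predict next the label whose estimate is farthest from 0.5) is an assumption made for illustration, since the abstract does not state the criterion. All names (`estimate`, `static_chain`, `dynamic_chain`, `TRAIN`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One training row: (feature tuple, label tuple). Toy data, purely illustrative.
Row = Tuple[Tuple[int, ...], Tuple[int, ...]]

TRAIN: List[Row] = [
    ((0,), (1, 1)), ((0,), (1, 0)), ((0,), (1, 1)), ((0,), (1, 0)),
    ((1,), (0, 0)), ((1,), (1, 0)), ((1,), (0, 0)), ((1,), (1, 0)),
]


def estimate(train: Sequence[Row], x: Tuple[int, ...],
             known: Dict[int, int], label: int) -> float:
    """Laplace-smoothed estimate of P(label = 1 | x, known labels).

    Assumption: a simple frequency count replaces the RDT model here.
    """
    match = [y for f, y in train
             if f == x and all(y[j] == v for j, v in known.items())]
    positives = sum(y[label] for y in match)
    return (positives + 1) / (len(match) + 2)


def static_chain(train: Sequence[Row], x: Tuple[int, ...],
                 order: Sequence[int]) -> Dict[int, int]:
    """Predict labels in one fixed order, conditioning on earlier predictions."""
    known: Dict[int, int] = {}
    for label in order:
        known[label] = int(estimate(train, x, known, label) >= 0.5)
    return known


def dynamic_chain(train: Sequence[Row], x: Tuple[int, ...],
                  n_labels: int) -> Tuple[Dict[int, int], List[int]]:
    """Choose the next label per instance: the most confident remaining one."""
    known: Dict[int, int] = {}
    order: List[int] = []
    while len(known) < n_labels:
        remaining = [l for l in range(n_labels) if l not in known]
        probs = {l: estimate(train, x, known, l) for l in remaining}
        # Assumed criterion: largest distance of the estimate from 0.5.
        best = max(remaining, key=lambda l: abs(probs[l] - 0.5))
        known[best] = int(probs[best] >= 0.5)
        order.append(best)
    return known, order


if __name__ == "__main__":
    for x in [(0,), (1,)]:
        prediction, order = dynamic_chain(TRAIN, x, n_labels=2)
        print(x, "order:", order, "prediction:", prediction)
    print("static:", static_chain(TRAIN, (1,), order=[0, 1]))
```

In this toy data the two test instances lead to different label orderings under the dynamic scheme, whereas `static_chain` applies the same ordering to both.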
Notes
1. We assume, w.l.o.g., that \(y_1, y_2, \ldots\) is the ordering of the predicted labels.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kulessa, M., Loza Mencía, E. (2018). Dynamic Classifier Chain with Random Decision Trees. In: Soldatova, L., Vanschoren, J., Papadopoulos, G., Ceci, M. (eds) Discovery Science. DS 2018. Lecture Notes in Computer Science, vol 11198. Springer, Cham. https://doi.org/10.1007/978-3-030-01771-2_3
DOI: https://doi.org/10.1007/978-3-030-01771-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01770-5
Online ISBN: 978-3-030-01771-2
eBook Packages: Computer Science (R0)