ABSTRACT
In multi-label learning, each training example is associated with a set of labels and the task is to predict the proper label set for unseen examples. Because the number of possible label sets grows exponentially with the number of labels, learning from multi-label examples is rather challenging, and the key to success lies in effectively exploiting correlations between different labels to facilitate the learning process. In this paper, we propose to use a Bayesian network structure to efficiently encode the conditional dependencies of the labels as well as the feature set, with the feature set as the common parent of all labels. To make the approach practical, we give an approximate yet efficient procedure for finding such a network structure. With the help of this network, multi-label learning is decomposed into a series of single-label classification problems, where a classifier is constructed for each label by incorporating its parental labels as additional features. Label sets of unseen examples are predicted recursively according to the label ordering given by the network. Extensive experiments on a broad range of data sets validate the effectiveness of our approach against other well-established methods.
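The sketch below illustrates the decomposition described in the abstract: given a dependency structure over the labels (here simply assumed to be provided as a parent map forming a DAG, not learned by the paper's approximate structure-search procedure), one binary classifier is trained per label on the features augmented with that label's parent labels, and prediction proceeds recursively in a topological order of the network. The use of scikit-learn's LogisticRegression and the dict-based representation are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch: per-label classifiers with parent labels as extra features,
# plus recursive prediction following the label ordering of the network.
import numpy as np
from sklearn.linear_model import LogisticRegression


def topological_order(parents):
    """Order labels so that every label appears after all of its parents."""
    order, seen = [], set()

    def visit(lbl):
        if lbl in seen:
            return
        for p in parents.get(lbl, []):
            visit(p)
        seen.add(lbl)
        order.append(lbl)

    for lbl in parents:
        visit(lbl)
    return order


def fit_label_classifiers(X, Y, parents):
    """Train one binary classifier per label.

    X: (n_samples, n_features) array.
    Y: dict mapping label -> (n_samples,) 0/1 array.
    parents: dict mapping label -> list of parent labels (a DAG).
    """
    models = {}
    for lbl in topological_order(parents):
        extra = [Y[p].reshape(-1, 1) for p in parents.get(lbl, [])]
        X_aug = np.hstack([X] + extra) if extra else X
        models[lbl] = LogisticRegression(max_iter=1000).fit(X_aug, Y[lbl])
    return models


def predict_label_sets(X, models, parents):
    """Predict labels in topological order, feeding predicted parents forward."""
    preds = {}
    for lbl in topological_order(parents):
        extra = [preds[p].reshape(-1, 1) for p in parents.get(lbl, [])]
        X_aug = np.hstack([X] + extra) if extra else X
        preds[lbl] = models[lbl].predict(X_aug)
    return preds
```

At prediction time each label's classifier sees the predicted values of its parents rather than the ground truth, which is why the recursion must follow the topological ordering given by the network.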