Research Article · DOI: 10.1145/1835804.1835930

Multi-label learning by exploiting label dependency

Published: 25 July 2010

ABSTRACT

In multi-label learning, each training example is associated with a set of labels, and the task is to predict the proper label set for an unseen example. Due to the exponential number of possible label sets, learning from multi-label examples is rather challenging, so the key to successful multi-label learning is to effectively exploit correlations among different labels to facilitate the learning process. In this paper, we propose to use a Bayesian network structure to efficiently encode the conditional dependencies of the labels as well as the feature set, with the feature set as the common parent of all labels. To make it practical, we give an approximate yet efficient procedure to find such a network structure. With the help of this network, multi-label learning is decomposed into a series of single-label classification problems, where a classifier is constructed for each label by incorporating its parental labels as additional features. Label sets of unseen examples are predicted recursively according to the label ordering given by the network. Extensive experiments on a broad range of data sets validate the effectiveness of our approach against other well-established methods.
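The decomposition described above can be sketched in a few lines: given a directed acyclic graph over the labels (with the feature vector implicitly the common parent of every label), train one binary classifier per label using its parent labels as extra input features, then predict labels recursively in topological order. In the sketch below, the two-label DAG, the toy data, and the tiny perceptron base learner are all illustrative assumptions; the paper's actual structure-learning procedure is not reproduced here.

```python
# Per-label decomposition over a label DAG: each label's classifier sees the
# features x plus the values of its parent labels; prediction follows the
# topological order of the DAG.
from graphlib import TopologicalSorter  # stdlib, Python 3.9+


def train_perceptron(X, y, epochs=50, lr=0.1):
    """Minimal perceptron: a stand-in for any binary base classifier."""
    w = [0.0] * len(X[0]) + [0.0]  # last entry is the bias
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - predict_perceptron(w, xi)
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)] + [w[-1] + lr * err]
    return w


def predict_perceptron(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + w[-1] > 0 else 0


def fit_label_dag(X, Y, parents):
    """Train one classifier per label, appending its parent labels as features.

    parents maps each label to the list of its parent labels in the DAG;
    Y is a list of {label: 0/1} dicts, one per training example.
    """
    order = list(TopologicalSorter(parents).static_order())  # parents come first
    models = {}
    for lbl in order:
        X_aug = [x + [Y[i][p] for p in parents[lbl]] for i, x in enumerate(X)]
        models[lbl] = train_perceptron(X_aug, [row[lbl] for row in Y])
    return order, models


def predict_label_dag(x, order, models, parents):
    """Predict labels recursively along the topological ordering."""
    out = {}
    for lbl in order:
        out[lbl] = predict_perceptron(models[lbl], x + [out[p] for p in parents[lbl]])
    return out


# Toy example: label "b" depends on label "a" (b = a AND second feature).
X = [[1, 0], [0, 1], [1, 1], [0, 0]]
Y = [{"a": 1, "b": 0}, {"a": 0, "b": 0}, {"a": 1, "b": 1}, {"a": 0, "b": 0}]
parents = {"a": [], "b": ["a"]}
order, models = fit_label_dag(X, Y, parents)
print(predict_label_dag([1, 1], order, models, parents))  # {'a': 1, 'b': 1}
```

Because the feature set is the common parent of all labels, every per-label classifier always sees the full feature vector; only the extra label inputs vary with the learned structure. At prediction time the parent labels are themselves predictions, which is why the labels must be resolved in the DAG's topological order.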


Supplemental Material

kdd2010_zhang_mlle_01.mov (MOV video, 119.8 MB)


Published in

KDD '10: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
July 2010, 1240 pages
ISBN: 978-1-4503-0055-1
DOI: 10.1145/1835804

Copyright © 2010 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

KDD overall acceptance rate: 1,133 of 8,635 submissions (13%)
