ABSTRACT
Common approaches to multi-label classification learn independent classifiers for each category and employ ranking or thresholding schemes for classification. Because they do not exploit dependencies between labels, such techniques are well suited only to problems in which categories are independent. In many domains, however, labels are highly interdependent. This paper explores multi-label conditional random field (CRF) classification models that directly parameterize label co-occurrences in multi-label classification. Experiments show that the models outperform their single-label counterparts on standard text corpora. Even when multi-labels are sparse, the models reduce subset classification error by as much as 40%.
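The abstract's central idea, directly parameterizing label co-occurrences rather than training independent per-label classifiers, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the parameter matrices `W` (per-label feature weights) and `P` (pairwise label co-occurrence weights) are filled with random values here, whereas a real system would learn them by maximizing conditional log-likelihood (e.g. with a quasi-Newton method such as L-BFGS), and inference here is exhaustive enumeration over all label subsets, which is only feasible for a small number of labels.

```python
import itertools

import numpy as np

# Toy problem sizes; real text corpora would have thousands of features.
n_features, n_labels = 5, 3

rng = np.random.default_rng(0)
# Hypothetical parameters: W holds per-label feature weights, P holds
# symmetric pairwise co-occurrence weights between labels. In practice
# both would be learned from data, not drawn at random.
W = rng.normal(size=(n_labels, n_features))
P = rng.normal(size=(n_labels, n_labels))
P = (P + P.T) / 2.0


def score(x, y):
    """Unnormalized log-potential of binary label vector y for input x."""
    s = sum(float(W[i] @ x) for i in range(n_labels) if y[i])
    # Pairwise terms: active label pairs contribute a co-occurrence weight.
    s += sum(float(P[i, j])
             for i in range(n_labels)
             for j in range(i + 1, n_labels)
             if y[i] and y[j])
    return s


# Exact inference: enumerate all 2^L binary label subsets.
subsets = list(itertools.product([0, 1], repeat=n_labels))


def predict(x):
    """MAP label subset under the collective model."""
    return max(subsets, key=lambda y: score(x, y))


x = rng.normal(size=n_features)
y_hat = predict(x)

# Conditional distribution over subsets: softmax of the subset scores
# (subtracting the max for numerical stability before exponentiating).
scores = np.array([score(x, y) for y in subsets])
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```

The pairwise terms `P[i, j]` let frequently co-occurring labels reinforce each other when scoring a candidate subset, which independent per-label classifiers cannot do; this is what allows the collective model to improve subset (exact-match) classification error.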
Collective multi-label classification