ABSTRACT
Common approaches to multi-label classification learn independent classifiers for each category and employ ranking or thresholding schemes for classification. Because they do not exploit dependencies between labels, such techniques are well suited only to problems in which categories are independent. In many domains, however, labels are highly interdependent. This paper explores multi-label conditional random field (CRF) classification models that directly parameterize label co-occurrences in multi-label classification. Experiments show that the models outperform their single-label counterparts on standard text corpora. Even when multi-labels are sparse, the models reduce subset classification error by as much as 40%.
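The abstract's central idea, directly parameterizing label co-occurrences rather than training independent per-label classifiers, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the parameter matrices `W` (per-label feature weights) and `P` (pairwise label co-occurrence weights) are filled with random values here, whereas a real system would learn them by maximizing conditional log-likelihood (e.g. with a quasi-Newton method such as L-BFGS), and inference here is exhaustive enumeration over all label subsets, which is only feasible for a small number of labels.

```python
import itertools

import numpy as np

# Toy problem sizes; real text corpora would have thousands of features.
n_features, n_labels = 5, 3

rng = np.random.default_rng(0)
# Hypothetical parameters: W holds per-label feature weights, P holds
# symmetric pairwise co-occurrence weights between labels. In practice
# both would be learned from data, not drawn at random.
W = rng.normal(size=(n_labels, n_features))
P = rng.normal(size=(n_labels, n_labels))
P = (P + P.T) / 2.0


def score(x, y):
    """Unnormalized log-potential of binary label vector y for input x."""
    s = sum(float(W[i] @ x) for i in range(n_labels) if y[i])
    # Pairwise terms: active label pairs contribute a co-occurrence weight.
    s += sum(float(P[i, j])
             for i in range(n_labels)
             for j in range(i + 1, n_labels)
             if y[i] and y[j])
    return s


# Exact inference: enumerate all 2^L binary label subsets.
subsets = list(itertools.product([0, 1], repeat=n_labels))


def predict(x):
    """MAP label subset under the collective model."""
    return max(subsets, key=lambda y: score(x, y))


x = rng.normal(size=n_features)
y_hat = predict(x)

# Conditional distribution over subsets: softmax of the subset scores
# (subtracting the max for numerical stability before exponentiating).
scores = np.array([score(x, y) for y in subsets])
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```

The pairwise terms `P[i, j]` let frequently co-occurring labels reinforce each other when scoring a candidate subset, which independent per-label classifiers cannot do; this is what allows the collective model to improve subset (exact-match) classification error.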
Collective multi-label classification