ABSTRACT
This paper explores feature scoring and selection based on weights from linear classification models, and investigates how such scoring interacts with the learning algorithm that consumes the selected features. Our comparative analysis covers three learning algorithms: Naïve Bayes, Perceptron, and Support Vector Machines (SVM), combined with three feature weighting methods: Odds Ratio, Information Gain, and weights from linear models (the linear SVM and Perceptron). Experiments show that feature selection using weights from linear SVMs yields better classification performance than the other feature weighting methods when combined with any of the three learning algorithms. The results support the conjecture that it is the sophistication of the feature weighting method, rather than its apparent compatibility with the learning algorithm, that improves classification performance.
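The core idea can be sketched compactly: train a linear classifier, rank features by the absolute values of the learned weights, and keep the top-k. The sketch below uses a simple Perceptron on synthetic data (NumPy only); the function names and toy dataset are illustrative, not from the paper, and the paper's strongest results use linear SVM weights rather than Perceptron weights.

```python
import numpy as np

def perceptron_weights(X, y, epochs=20):
    """Train a plain Perceptron (labels in {-1, +1}); return the weight vector."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w) <= 0:   # mistake-driven update
                w += yi * xi
    return w

def select_top_features(X, y, k):
    """Score features by |w| from a linear model; return indices of the k largest."""
    w = perceptron_weights(X, y)
    return np.argsort(-np.abs(w))[:k]

# Toy data: features 0 and 1 carry class signal; features 2 and 3 are pure noise.
rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=200)
X = np.column_stack([
    y + 0.1 * rng.normal(size=200),   # strongly predictive
    y + 0.5 * rng.normal(size=200),   # weakly predictive
    rng.normal(size=200),             # noise
    rng.normal(size=200),             # noise
])
print(select_top_features(X, y, 2))   # indices of the two largest-|w| features
```

In the paper's setting, the reduced feature set produced this way is then handed to a (possibly different) learning algorithm such as Naïve Bayes; the finding is that SVM-derived weights rank features well regardless of which learner is trained afterwards.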
Index Terms
- Feature selection using linear classifier weights: interaction with classification models