ABSTRACT
This paper explores feature scoring and selection based on weights from linear classification models, and investigates how such scoring interacts with the learning algorithm that consumes the selected features. Our comparative analysis covers three learning algorithms: Naïve Bayes, Perceptron, and Support Vector Machines (SVM), combined with three feature weighting methods: Odds Ratio, Information Gain, and weights from linear models (the linear SVM and Perceptron). Experiments show that feature selection using weights from linear SVMs yields better classification performance than the other feature weighting methods when combined with any of the three learning algorithms. The results support the conjecture that it is the sophistication of the feature weighting method, rather than its apparent compatibility with the learning algorithm, that improves classification performance.
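The core idea can be sketched compactly: train a linear classifier, rank features by the absolute values of the learned weights, and keep the top-k. The sketch below uses a simple Perceptron on synthetic data (NumPy only); the function names and toy dataset are illustrative, not from the paper, and the paper's strongest results use linear SVM weights rather than Perceptron weights.

```python
import numpy as np

def perceptron_weights(X, y, epochs=20):
    """Train a plain Perceptron (labels in {-1, +1}); return the weight vector."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w) <= 0:   # mistake-driven update
                w += yi * xi
    return w

def select_top_features(X, y, k):
    """Score features by |w| from a linear model; return indices of the k largest."""
    w = perceptron_weights(X, y)
    return np.argsort(-np.abs(w))[:k]

# Toy data: features 0 and 1 carry class signal; features 2 and 3 are pure noise.
rng = np.random.default_rng(0)
y = rng.choice([-1, 1], size=200)
X = np.column_stack([
    y + 0.1 * rng.normal(size=200),   # strongly predictive
    y + 0.5 * rng.normal(size=200),   # weakly predictive
    rng.normal(size=200),             # noise
    rng.normal(size=200),             # noise
])
print(select_top_features(X, y, 2))   # indices of the two largest-|w| features
```

In the paper's setting, the reduced feature set produced this way is then handed to a (possibly different) learning algorithm such as Naïve Bayes; the finding is that SVM-derived weights rank features well regardless of which learner is trained afterwards.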
Index Terms
- Feature selection using linear classifier weights: interaction with classification models