
A noise-aware feature selection approach for classification

Methodologies and Application · Soft Computing

Abstract

In this paper, a noise-aware version of the support vector machine (SVM) is used for feature selection. Combining this method with sequential backward search (SBS), a new algorithm for removing irrelevant features is proposed. Although feature selection methods in the literature that rely on support vector machines have produced acceptable results, noisy samples and outliers may degrade the performance of the SVM and, consequently, of the feature selection method. Recently, we proposed the relaxed constraints SVM (RSVM), which handles noisy data and outliers. In RSVM, each training sample is assigned a degree of importance computed with the fuzzy c-means clustering method, so noisy samples and outliers receive lower importance degrees. Moreover, RSVM has more relaxed constraints, which further reduces the effect of noisy samples. Feature selection increases the accuracy of many machine learning applications by eliminating noisy and irrelevant features. In the proposed RSVM-SBS feature selection algorithm, noisy data have little effect on the elimination of irrelevant features. Experimental results on real-world data verify that RSVM-SBS outperforms other feature selection approaches that use support vector machines.
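The article does not include code at this point; the following is a minimal sketch of the backward-search wrapper idea only, under stated assumptions: a standard soft-margin SVC with per-sample weights stands in for RSVM (whose relaxed constraints actually modify the underlying optimization problem), and a simple distance-to-class-centroid weighting stands in for the paper's fuzzy c-means importance degrees. The names `importance_degrees` and `sbs_feature_selection` are illustrative, not from the paper.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def importance_degrees(X, y):
    """Toy stand-in for the paper's fuzzy c-means memberships:
    samples far from their class centroid get lower weights."""
    w = np.empty(len(y), dtype=float)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        d = np.linalg.norm(X[idx] - X[idx].mean(axis=0), axis=1)
        w[idx] = 1.0 / (1.0 + d)  # likely noisy samples / outliers -> small weight
    return w

def sbs_feature_selection(X, y, n_keep, random_state=0):
    """Sequential backward search wrapped around a sample-weighted SVM:
    repeatedly drop the feature whose removal hurts validation accuracy least."""
    X_tr, X_va, y_tr, y_va, w_tr, _ = train_test_split(
        X, y, importance_degrees(X, y), test_size=0.3, random_state=random_state)
    features = list(range(X.shape[1]))
    while len(features) > n_keep:
        best = None
        for f in features:                      # try removing each remaining feature
            trial = [g for g in features if g != f]
            clf = SVC(kernel="rbf", C=1.0)
            clf.fit(X_tr[:, trial], y_tr, sample_weight=w_tr)
            acc = clf.score(X_va[:, trial], y_va)
            if best is None or acc > best[0]:
                best = (acc, f)
        features.remove(best[1])                # discard the least useful feature
    return features
```

A call such as `selected = sbs_feature_selection(X, y, n_keep=10)` would return the indices of the retained features; in the paper's method the down-weighting of noisy samples is what keeps them from driving which features get eliminated.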




Author information


Corresponding author

Correspondence to Mostafa Sabzekar.

Ethics declarations

Conflict of interest

The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Sabzekar, M., Aydin, Z. A noise-aware feature selection approach for classification. Soft Comput 25, 6391–6400 (2021). https://doi.org/10.1007/s00500-021-05630-7


